SEEK Asia’s approach to going serverless, and a discovery along the way
At SEEK Asia we constantly face the challenge of managing the two biggest job classifieds platforms in SEA, namely JobStreet and JobsDB. With almost 20 years of legacy and a 200+ strong engineering and product team, we certainly feel the need to consolidate our services, adopt new technologies, and build better software!
One of those challenges is consolidating and rebuilding the user re-engagement service. The goal of this project is to create a new platform that gets the right content and sends it to the right person through the right channel at the right time. In other words, we want to build a platform that:
1- Decides who in our user base needs a new job.
2- Gets the best matching jobs for those users.
3- Decides which channel (mobile push, email, SMS, etc.) to send through.
4- Finally, decides the best time of day to send that notification.
Knowing the technical complexity of the mission, and the fact that eventually this platform will serve tens of millions of monthly active users, we felt the need to look around for technologies that allow us to move fast, experiment, scale fast, and lower the maintenance overhead as much as possible.
After brainstorming, we decided that AWS Lambda combined with Amazon SQS was a good option to start with. The interesting part, however, is the architecture of the services that we came up with:
The flow starts with our data pipeline, which compiles the list of recipients and sends it to matchingTransporterMailbox.
The data then flows between 3 main lambda functions as follows:
1- matchingTransporter reads from its mailbox, and for each user it gets the matching job IDs from the Matching SaaS, then it sends the result to jobComposerMailbox.
2- jobComposer reads from its mailbox, and for each recommended job ID it fetches the data from the relevant data stores, then it sends the result to deliveryMailbox.
3- delivery reads from its mailbox, and for each user it selects the notification template to be sent and compiles the JSON payload for that template.
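To make the flow of those three steps concrete, here is a minimal local sketch in Python. It is not our production code: the mailboxes are in-memory deques standing in for the SQS queues, `fetch_matching_job_ids` and `fetch_job_details` are hypothetical stubs for the Matching SaaS and the data stores, and the `user_id`, `channel` and template names are assumptions made up for the illustration.

```python
import json
from collections import deque

# In-memory stand-ins for the three SQS queues ("mailboxes") named above.
mailboxes = {
    "matchingTransporterMailbox": deque(),
    "jobComposerMailbox": deque(),
    "deliveryMailbox": deque(),
}


def send(mailbox, message):
    """Serialize a message and append it to a mailbox, like an SQS send."""
    mailboxes[mailbox].append(json.dumps(message))


def fetch_matching_job_ids(user_id):
    # Hypothetical stub for the call to the Matching SaaS.
    return [f"job-{user_id}-1", f"job-{user_id}-2"]


def fetch_job_details(job_id):
    # Hypothetical stub for the lookups against the data stores.
    return {"id": job_id, "title": f"Title for {job_id}"}


def matching_transporter(message):
    """Step 1: attach matching job IDs and forward to jobComposerMailbox."""
    user = json.loads(message)
    job_ids = fetch_matching_job_ids(user["user_id"])
    send("jobComposerMailbox", {**user, "job_ids": job_ids})


def job_composer(message):
    """Step 2: resolve each job ID to job data and forward to deliveryMailbox."""
    data = json.loads(message)
    jobs = [fetch_job_details(job_id) for job_id in data["job_ids"]]
    send("deliveryMailbox",
         {"user_id": data["user_id"], "channel": data["channel"], "jobs": jobs})


def delivery(message):
    """Step 3: pick a template (assumed names) and build its JSON payload."""
    data = json.loads(message)
    template = "push_digest" if data["channel"] == "push" else "email_digest"
    return {"template": template, "to": data["user_id"], "jobs": data["jobs"]}


def run_pipeline(recipients):
    """Feed recipients into the first mailbox and drain each stage in turn."""
    for recipient in recipients:
        send("matchingTransporterMailbox", recipient)
    while mailboxes["matchingTransporterMailbox"]:
        matching_transporter(mailboxes["matchingTransporterMailbox"].popleft())
    while mailboxes["jobComposerMailbox"]:
        job_composer(mailboxes["jobComposerMailbox"].popleft())
    payloads = []
    while mailboxes["deliveryMailbox"]:
        payloads.append(delivery(mailboxes["deliveryMailbox"].popleft()))
    return payloads
```

Calling `run_pipeline([{"user_id": "u1", "channel": "email"}])` returns one payload per recipient. The point of the sketch is that each function only knows its own mailbox and the name of the next one, which is what lets the stages scale and fail independently.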
Now, there is one trick: at the time of writing, Lambda can’t be invoked from SQS directly. So in front of each lambda we have another lambda, scheduled via CloudWatch, that checks whether there are any messages in the mailbox and invokes a lambda instance for each message until the mailbox is empty. If there are no messages, it sleeps for 5 minutes and then wakes up to check again.
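The invoker logic can be sketched roughly as below. This is an illustration under assumptions, not our actual function: `drain_mailbox` and `FakeMailbox` are hypothetical names, and the three callables are injected so the sketch runs locally. In a real deployment they would presumably wrap the boto3 SQS client's `receive_message` and `delete_message` and the Lambda client's `invoke` with `InvocationType="Event"`. The sketch covers a single scheduled run; the 5-minute wait comes from the schedule, not from the code.

```python
from collections import deque


def drain_mailbox(receive_batch, invoke_worker, delete_message):
    """Invoke one worker per message until the mailbox reports empty.

    receive_batch()        -> list of {"ReceiptHandle": ..., "Body": ...}
    invoke_worker(body)    -> hands one message body to a worker lambda
    delete_message(handle) -> removes the message from the mailbox
    Returns the number of workers invoked during this run.
    """
    invoked = 0
    while True:
        batch = receive_batch()
        if not batch:
            # Mailbox is empty: stop and wait for the next scheduled run.
            return invoked
        for message in batch:
            invoke_worker(message["Body"])
            delete_message(message["ReceiptHandle"])
            invoked += 1


class FakeMailbox:
    """In-memory stand-in for an SQS queue, for local experimentation only."""

    def __init__(self, bodies, batch_size=10):
        self._pending = deque(
            (f"rh-{i}", body) for i, body in enumerate(bodies)
        )
        self._in_flight = {}
        self._batch_size = batch_size

    def receive_batch(self):
        """Hand out up to batch_size messages and mark them in flight."""
        batch = []
        while self._pending and len(batch) < self._batch_size:
            handle, body = self._pending.popleft()
            self._in_flight[handle] = body
            batch.append({"ReceiptHandle": handle, "Body": body})
        return batch

    def delete_message(self, handle):
        """Drop an in-flight message once it has been handed to a worker."""
        del self._in_flight[handle]
```

One design choice worth calling out: this sketch deletes a message as soon as it has been handed to a worker. Whether to delete at hand-off or only after the worker succeeds is a trade-off that depends on how you want failures and the dead letter queue to behave.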
Of course, we could have used SNS to invoke Lambda, but the team decided that SNS might add costs that are harder to control, especially if we wanted to handle a dead letter queue.
Does this architecture remind you of something?
At first I didn’t notice, but then it suddenly occurred to me how similar this is to the actor model!
From Wikipedia:
an actor can: make local decisions, create more actors, send more messages, and determine how to respond to the next message received. Actors may modify private state, but can only affect each other through messages.
- Each lambda is an actor, and to each lambda we attached a mailbox.
- The invoker lambda can create more lambdas.
- The other lambdas read from their mailbox, send to other lambdas’ mailboxes, and maintain internal state.
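For readers who have not met the actor model before, the mapping above can be sketched as a toy, single-threaded actor system in Python. All the names here (`Actor`, `ActorSystem`, `Doubler`, `Collector`) are hypothetical and exist only for this illustration; real actor runtimes process mailboxes concurrently, which this sketch does not attempt.

```python
from collections import deque


class Actor:
    """An actor owns a mailbox and private state; others only send messages."""

    def __init__(self, name):
        self.name = name
        self.mailbox = deque()

    def receive(self, message, system):
        raise NotImplementedError


class ActorSystem:
    """Toy scheduler: delivers one message per actor per pass until idle."""

    def __init__(self):
        self.actors = {}

    def spawn(self, actor):
        self.actors[actor.name] = actor
        return actor

    def send(self, name, message):
        self.actors[name].mailbox.append(message)

    def run(self):
        progressed = True
        while progressed:
            progressed = False
            # Snapshot the actors, since receive() may spawn new ones.
            for actor in list(self.actors.values()):
                if actor.mailbox:
                    actor.receive(actor.mailbox.popleft(), self)
                    progressed = True


class Collector(Actor):
    def __init__(self, name):
        super().__init__(name)
        self.state = []  # private state, only changed by this actor

    def receive(self, message, system):
        self.state.append(message)


class Doubler(Actor):
    def receive(self, message, system):
        # An actor may create more actors...
        if "collector" not in system.actors:
            system.spawn(Collector("collector"))
        # ...and send more messages, but never touches another actor's state.
        system.send("collector", message * 2)
```

Spawning a `Doubler`, sending it a few numbers, and calling `run()` leaves the doubled values in the collector's private state, with the two actors having interacted only through mailboxes, much like our lambdas and their queues.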
After reading "The actor model in 10 minutes", I felt that we might be onto something here. We would love to hear your comments, so let us know your thoughts.