This blog post by Path server engineer Neil Chintomby is part of a series in which our engineers discuss how we build some of our favorite features.
We wanted to build messaging to allow folks to communicate directly, either with a friend or family member or with a group of people. And we wanted them to be able to share the same creative content through messaging that they can share elsewhere in Path.
To build the feature, we took the four people who were working on messaging — one each from iOS, Android, server, and design — and put them in a room together until the feature shipped. It was really helpful to have everyone working in the same place because messaging was such a large cross-functional effort.
On the backend, we started out with a stock ejabberd server, which is an open source Erlang-based XMPP server, using the publish-subscribe XMPP extension (XEP-0060) and then customized this configuration to fit our feature and performance needs.
Out of the box, ejabberd comes with an mnesia database. We replaced mnesia with a Python-based API service that talks to MongoDB. We had three main reasons for implementing this API service. First, we wanted to decouple the ejabberd servers from the datastore, giving us a layer of technical flexibility. We had learned from previous experience that having servers talk directly to MongoDB led to difficulties in scaling, so we wanted to avoid those pitfalls when bringing up messaging. Second, the API service provides a layer of caching between ejabberd and MongoDB, giving us more consistent timings when fetching data. Third, with our own API service we could cache many more items per node than ejabberd with mnesia allows, since mnesia limits the number of items maintained for each node. Although this limit is configurable, controlling our own caching layer seemed less likely to cause problems.
The main purpose of the API layer is to provide caching on top of MongoDB. Instead of querying the database repeatedly for potentially unchanged data, the API layer caches the nodes to which a given user is subscribed, as well as the items in a given node. When an item is published, it is written through the cache on the API layer: the item is written to the items database, and the corresponding node in the nodes database is updated with the new item's information. The nodes database maintains a last modified timestamp per node, as well as a cache of the most recent items in the node, stored as a single blob of data to reduce unnecessary queries.
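To make the write-through path concrete, here is a minimal sketch of what the publish flow through the API service might look like, using pymongo. The collection names, field names, cache size, and the in-process cache are our own assumptions for illustration, not Path's actual schema:

```python
# Sketch of the publish path through the API layer's write-through cache.
# Collection names, field layout, and the in-memory cache are assumptions.
import time
from pymongo import MongoClient

client = MongoClient()
db = client.messaging
RECENT_ITEMS_CACHED = 50  # hypothetical cap on the per-node item blob

node_cache = {}  # in-memory cache in front of the nodes collection

def publish_item(node_id, item):
    item["node_id"] = node_id
    item["published_at"] = time.time()

    # 1. Persist the item itself to the items database.
    db.items.insert_one(item)

    # 2. Update the node document: bump the last modified timestamp and
    #    prepend the item to the cached blob of recent items.
    db.nodes.update_one(
        {"_id": node_id},
        {
            "$set": {"last_modified": item["published_at"]},
            "$push": {
                "recent_items": {
                    "$each": [item],
                    "$position": 0,
                    "$slice": RECENT_ITEMS_CACHED,
                }
            },
        },
        upsert=True,
    )

    # 3. Refresh the in-memory cache so subsequent reads skip MongoDB.
    node_cache[node_id] = db.nodes.find_one({"_id": node_id})
```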
It was important to make sure that the API layer operated as efficiently as possible because one of our requirements was to be able to use messaging from multiple devices. If we only supported messaging on one device at any given time, it wouldn’t be as crucial to have a performant retrieval system or long-term message store. Because we decided to allow users to message from as many devices as they wanted, we had to make sure that we would be able to scale our long-term storage system and aggressively optimize message retrieval to access those messages.
Client Feature Requirements
One of the features we implemented in the client was the ability to scroll back through earlier messages, since it doesn’t make sense to store the entire history of a conversation locally on the client, especially for chatty conversations. The result set management XMPP extension (XEP-0059) didn’t quite meet our requirements because it is quantity-based, while our approach to pulling older messages is time-based. We decided to modify the items request to take `before` and `since` attributes. The client uses these date attributes to specify the range of items to request, giving it the flexibility to ask only for the items it needs to display.
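A sketch of what that time-based retrieval might look like on the API side, given the attribute semantics described above. The function and field names are illustrative assumptions:

```python
# Sketch of a time-based items fetch, assuming the modified items request
# carries `before` and `since` timestamps. `since` pulls items newer than
# what the client has; `before` supports scrolling back past the oldest
# locally stored message.
def get_items(db, node_id, before=None, since=None, limit=50):
    """Return items for a node published in (since, before), newest first."""
    query = {"node_id": node_id}
    time_range = {}
    if since is not None:
        time_range["$gt"] = since   # only items newer than the client's copy
    if before is not None:
        time_range["$lt"] = before  # scroll-back: items older than those shown
    if time_range:
        query["published_at"] = time_range
    return list(db.items.find(query).sort("published_at", -1).limit(limit))
```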
Another feature we implemented on the client was the ability for a user to set user-specific settings on a conversation, such as muting (suppressing notifications for) a conversation. To support this on the server side, we created a new type of request that the client can call specifically to get and set these user node settings, and added new handlers on the ejabberd side to handle the new stanzas. Per-user, per-node settings are conceptually similar to subscription statuses, in that there is a specific state for each user on each node; internally, that information is in fact stored in the same database document. However, overloading the subscribe request to fetch these client-specific settings did not seem ideal, so we decided to create a new set of requests to handle these application-level features.
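A minimal sketch of how those get and set handlers might persist settings into the subscription document, which, as noted above, is where the per-user, per-node state already lives. The field names are assumptions:

```python
# Sketch of the get/set user node settings handlers on the API side.
# The settings live in the same document as the subscription; all field
# names here are illustrative, not Path's actual schema.
import time

def set_user_node_settings(db, user_id, node_id, settings):
    """Handle a 'set user node settings' stanza, e.g. muting a conversation."""
    db.subscriptions.update_one(
        {"user_id": user_id, "node_id": node_id},
        {
            "$set": {
                "settings": settings,  # e.g. {"muted": True}
                "settings_modified": time.time(),
            }
        },
    )

def get_user_node_settings(db, user_id, node_id):
    """Handle a 'get user node settings' stanza."""
    doc = db.subscriptions.find_one(
        {"user_id": user_id, "node_id": node_id},
        {"settings": 1, "settings_modified": 1},
    )
    return doc or {}
```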
To minimize the number of calls the client makes to the messaging servers, we added last modified timestamps to the subscriptions response. First, we wanted to minimize get items requests, so that we didn’t request items for subscriptions that hadn’t changed in months. We added a new last modified attribute to each subscription in the subscriptions response, indicating when an item was last published to that node, which the client stores locally. The next time the client asks for subscriptions, if the last published timestamp for a node is newer than the timestamp the client has stored, the client requests new items from that node; this is another request where we can use the `since` attribute we added to get items requests. This way the client only makes get items requests for nodes that actually have new content, reducing get items calls to the server.
Because we support messaging on multiple devices, a user could update their settings from one device, and we would then need some way to keep the other devices in sync. A basic approach would be to ask for the user node settings for every subscription on every connect, but this is extremely chatty, especially for information that is unlikely to change often. To reduce these requests, we added a timestamp recording when the user last changed their settings for a node, both to each subscription in the subscriptions response and to the get and set user node settings responses. The client stores this timestamp locally, and only when it sees that a subscription has a newer settings timestamp does it make a get user node settings request to pick up the new settings.
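Putting both timestamps together, a single sync pass on the client might look like the following sketch. We've modeled it in Python for illustration; the `server` methods and the local store layout are assumptions, not the actual client code:

```python
# Sketch of the client's sync pass, combining both optimizations: the
# last published check gates get-items requests, and the settings
# timestamp check gates get-user-node-settings requests.
def sync_subscriptions(server, local):
    """`server` stands in for the XMPP connection; `local` maps
    node_id -> {'last_published': ts, 'settings_modified': ts, ...}."""
    for sub in server.get_subscriptions():
        node_id = sub["node_id"]
        seen = local.setdefault(
            node_id, {"last_published": 0, "settings_modified": 0}
        )

        # Items changed on the server: fetch only what's newer than we have.
        if sub["last_published"] > seen["last_published"]:
            seen["items"] = server.get_items(node_id, since=seen["last_published"])
            seen["last_published"] = sub["last_published"]

        # Settings changed on another device: fetch the new settings.
        if sub["settings_modified"] > seen["settings_modified"]:
            seen["settings"] = server.get_user_node_settings(node_id)
            seen["settings_modified"] = sub["settings_modified"]
```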
On the iOS client, we were already using Core Data as our local store, so we added new message and conversation entities to the model. To handle the XMPP side of things, we used XMPPFramework (https://github.com/robbiehanson/XMPPFramework). Choosing to use a readily-available XMPP framework allowed us to get up and running quickly and spend more time focusing on implementing features and making performance optimizations.
Optimizing Sent Message Display
We spent time thinking about how to increase the responsiveness of the message-sending experience and came up with the idea of pending messages. Our initial implementation of sending messages used a straightforward fetched results controller approach, which watched for new messages added to a conversation and then displayed them. So, if the user typed a message and pushed the send button, the message object would be created and saved on a separate serial queue used for writing to the database. The main thread would then hear about the change, merge the changes in, and then finally the fetched results controller would get a callback about the new message object and the sent message would appear in the conversation’s table view. Using this basic initial approach on an iPhone 4 running iOS 5.0.1, it took on average 350 milliseconds for a sent text message to appear in the sender’s thread.
We saw room for improvement here and experimented with the idea of having a “pending message” to display in the conversation. Instead of waiting for the database write queue to create and save the message object, and then wait for it to propagate back to the fetched results controller, we create a message object on the main thread and add it directly to the conversation’s table view so that the message displays nearly immediately. After that, the real version of the message object gets created and saved on the write queue, as expected. Eventually when the fetched results controller hears about the new real message, it checks that message against the list of pending messages. If it finds a real message match for a pending message, that message cell’s backing message gets transparently swapped out from the pending message to the real message. With the pending message approach on an iPhone 4 running iOS 5.0.1, it took on average 35 milliseconds for a sent text message to appear in the sender’s thread, an order of magnitude faster than the initial approach.
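As a rough sketch of that flow, here is the pending-message bookkeeping modeled in Python; the real implementation is Objective-C against Core Data, so every name here is illustrative:

```python
# Sketch of the pending-message idea: show an optimistic placeholder
# immediately, then swap in the persisted message once the write queue and
# fetched results controller catch up.
import uuid

def enqueue_database_write(client_id, text):
    """Placeholder for the serial queue that writes to the real store."""

class Conversation:
    def __init__(self):
        self.rows = []     # messages currently shown in the table view
        self.pending = {}  # client_id -> row index of a pending message

    def send(self, text):
        client_id = str(uuid.uuid4())
        placeholder = {"client_id": client_id, "text": text, "pending": True}
        self.pending[client_id] = len(self.rows)
        self.rows.append(placeholder)            # displayed nearly immediately
        enqueue_database_write(client_id, text)  # real save happens off-thread

    def on_saved(self, real_message):
        # Called when the store reports the saved message: match it against
        # the pending list and transparently swap the cell's backing object.
        row = self.pending.pop(real_message["client_id"], None)
        if row is not None:
            self.rows[row] = real_message
```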
Multi-step Photo Messaging
We took performance optimizations even further with photo messages, which involve multiple steps. While the user is on the photo editing screen, all of the edits, such as filters, are previewed on a smaller version of the full image for performance. Once the user presses the send button, we apply the edits to the full size image and generate two thumbnails: a colour thumbnail and a low resolution greyscale thumbnail. The colour thumbnail is used for displaying the photo message on the sender’s side, and the greyscale thumbnail is attached to the message that gets sent to the recipient. The greyscale thumbnail acts as a placeholder, a message of intent for the receiver, while the full size edited photo is processed by our servers and then broadcast back down to the recipients with a colour thumbnail as well as the full size photo. Sending photos in many small steps means the flow of conversation is not interrupted or held up by photo uploads.
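For a rough idea of the two-thumbnail step, here is a sketch using Python and Pillow; the actual client uses iOS imaging APIs, and the sizes here are made up for illustration:

```python
# Sketch of generating the colour and greyscale thumbnails from the edited
# photo. Pillow stands in for the iOS imaging stack used in practice.
from PIL import Image

THUMB_SIZE = (320, 320)  # hypothetical thumbnail bounds

def make_thumbnails(path):
    full = Image.open(path)
    colour = full.copy()
    colour.thumbnail(THUMB_SIZE)     # shown immediately on the sender's side
    greyscale = colour.convert("L")  # low resolution placeholder for the receiver
    return colour, greyscale
```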
Optimizing Photo Sending
Our initial implementation of photo sending worked as follows: as soon as the user pressed the send button for the photo, we created the colour and greyscale thumbnails and applied the image edits to the full size photo, all on a global queue. Once all of the image processing was finished, we came back to the database write queue, created the photo message object, attached the thumbnails, and then sent the photo message. However, even with pending messages, this basic approach took on average 4 seconds for a sent photo message to appear in the sender’s conversation on an iPhone 4 running iOS 5.0.1.
Almost all of that time was spent applying the image edits to the full size image, so it was clear we needed to defer the full size image processing. In the approach we ended up shipping, as soon as the user presses the send button we immediately create the small colour thumbnail and the low resolution greyscale thumbnail. We then create the pending message as described earlier, using the colour thumbnail, and display the photo message immediately to the sender. Next we send the message, with the greyscale thumbnail attached, off to the recipient. Only after that do we dispatch the full size image processing to the global queue. When the image processing finishes, we create the real message object and upload the photo to our servers to be broadcast back down to the receivers. By deferring the full size image processing until after we create the message for the sender and fire off the low resolution greyscale thumbnail to the receiver, it takes on average 250 milliseconds for a sent photo message to appear in the sender’s conversation on an iPhone 4 running iOS 5.0.1, again an order of magnitude improvement, with almost all of that time spent creating the two thumbnails.
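Here is a sketch of that reordered pipeline, again modeled in Python rather than the client's actual Objective-C; `make_thumbnails` is the sketch from earlier, and `send_stanza`, `apply_edits`, and `upload` are placeholder names for the real client and server calls:

```python
# Sketch of the deferred photo-send pipeline: the cheap work (thumbnails,
# pending message, greyscale placeholder) happens up front, and the
# expensive full-size edit is pushed to a background queue.
from concurrent.futures import ThreadPoolExecutor

background = ThreadPoolExecutor(max_workers=1)  # stands in for the global queue

def send_photo(conversation, photo_path, edits):
    # 1. Cheap work first: the two small thumbnails (~250 ms total).
    colour, greyscale = make_thumbnails(photo_path)

    # 2. Show the pending message to the sender right away.
    conversation.add_pending_photo(colour)

    # 3. Fire off the greyscale placeholder to the recipients.
    send_stanza(conversation.node_id, thumbnail=greyscale)

    # 4. Only now defer the expensive full-size edit off the hot path.
    background.submit(process_full_size, conversation, photo_path, edits)

def process_full_size(conversation, photo_path, edits):
    full = apply_edits(photo_path, edits)  # the formerly ~4 s step
    conversation.save_real_message(full)   # create the real message object
    upload(full)                           # servers broadcast it back down
```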
We are continuing to make messaging better, more performant, and more fun!
Want to help build our next big feature? Apply now at path.com/jobs.