Implementing a Tracking System for iOS with CoreData

Jun 22, 2020 • Emre Havan

As iOS developers, we often need to implement tracking in our applications. Many third-party frameworks allow us to add tracking to our projects, but in this article we are going to talk about how we implemented our custom tracking infrastructure at Freeletics with the help of CoreData, without using any third-party framework.

Our system saves each event generated by users and stores it temporarily; once the number of stored events reaches a defined limit, the events are sent to the server. The client-side tracking infrastructure is composed of three main entities: storage, batcher and sender.

TrackingEventsStorage: Responsible for storing, fetching and deleting events using CoreData.
TrackingEventsBatcher: Responsible for batching events and acting as a layer of communication between TrackingEventsStorage and InHouseTrackingEventSender.
InHouseTrackingEventSender: Responsible for sending a list of events to the server.

For the event itself, we have two different models. One is an NSManagedObject subclass (ManagedInHouseTrackingEvent) used to store events in CoreData; the other is a simple struct (InHouseTrackingEvent) that is easy to initialise from the consumer side and is later sent to the backend. Our models look like the following:

ManagedInHouseTrackingEvent:

@objc(ManagedInHouseTrackingEvent)
public final class ManagedInHouseTrackingEvent: NSManagedObject {

}

extension ManagedInHouseTrackingEvent {

    @nonobjc public class func fetchRequest() -> NSFetchRequest<ManagedInHouseTrackingEvent> {
        return NSFetchRequest<ManagedInHouseTrackingEvent>(entityName: String(describing: ManagedInHouseTrackingEvent.self))
    }

    @NSManaged public var name: String?
    @NSManaged public var properties: Data?
    @NSManaged public var id: String?
}

extension ManagedInHouseTrackingEvent {
    enum PropertyKey: String {
        case id
        case name
        case properties
    }
}

InHouseTrackingEvent:

struct InHouseTrackingEvent {
    let id: String
    let name: String
    let properties: [String: Any]
}

We normally do not need an id property for our events, but we will use it when creating the core data event models so that we can distinguish persisted events from each other later on.

As you can see, the properties field is of type Data in our managed model, whereas it is a [String: Any] dictionary in InHouseTrackingEvent. Since we are just going to use managed models to persist data rather than manipulating any existing ones, we are just going to convert properties to Data to easily persist them as Binary Data with CoreData.
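As a minimal sketch of that conversion, JSONSerialization can turn the dictionary into Data and back. The helper names encodeProperties and decodeProperties are hypothetical and not part of the original code. The conversion only works when every value is JSON-compatible (strings, numbers, booleans, arrays, dictionaries or NSNull), which is why the sketch checks isValidJSONObject before serialising:

```swift
import Foundation

// Hypothetical helpers, shown only to illustrate the [String: Any] <-> Data conversion.

/// Serialises a properties dictionary to Data, or returns nil if it contains
/// values that JSON cannot represent (for example a Date).
func encodeProperties(_ properties: [String: Any]) -> Data? {
    guard JSONSerialization.isValidJSONObject(properties) else {
        return nil
    }
    return try? JSONSerialization.data(withJSONObject: properties)
}

/// Restores the dictionary from previously serialised Data.
func decodeProperties(_ data: Data) -> [String: Any]? {
    guard let object = try? JSONSerialization.jsonObject(with: data) else {
        return nil
    }
    return object as? [String: Any]
}

let original: [String: Any] = ["propertyOne": "1", "propertyTwo": true]
let restored = encodeProperties(original).flatMap(decodeProperties)
```

One consequence of this design is that event properties are limited to what JSON can express, so values such as dates would have to be converted (for example to strings or timestamps) before an event is created.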

Event Storage Implementation

After creating our models, and also the xcdatamodel for ManagedInHouseTrackingEvent, we will continue with the core data stack.

We need a core data stack from whose persistent container our storage class can initialise a managed context. Later we use this managed context to look up the NSEntityDescription that describes our entity, and to perform all CRUD operations on the database.

InHouseTrackingCoreDataStack:

final class InHouseTrackingCoreDataStack {

    static let shared = InHouseTrackingCoreDataStack()
    private let containerName = "FreeleticsInHouseTracking"

    private init() {}

    lazy var persistentContainer: NSPersistentContainer = {
        let container = NSPersistentContainer(name: containerName)
        container.loadPersistentStores(completionHandler: { (_, error) in
            if let error = error {
                // Loading failed; nothing can be persisted until this is resolved.
                print("Failed to load persistent stores: \(error)")
            }
        })
        return container
    }()
}

TrackingEventsStorage:

Now we can create the TrackingEventsStorage class. It will have four properties:

entityName: Representing the class name for our core data model.
coreDataStack: A reference to our core data stack.
managedContext: An NSManagedObjectContext which will be used to wrap all CRUD operations for core data.
eventEntity: An NSEntityDescription representing our core data model.

final class TrackingEventsStorage {

    let managedContext: NSManagedObjectContext
    let eventEntity: NSEntityDescription?

    private let entityName = "ManagedInHouseTrackingEvent"
    private let coreDataStack = InHouseTrackingCoreDataStack.shared

    init() {
        managedContext = coreDataStack.persistentContainer.newBackgroundContext()
        eventEntity = NSEntityDescription.entity(forEntityName: entityName,
                                                 in: managedContext)
    }
}

When we initialise the managedContext by using newBackgroundContext() from the persistent container, it has the concurrency type privateQueueConcurrencyType, meaning it owns a private queue. We want a dedicated managedContext so that every database operation is executed on that same queue. We need this because managed object contexts and their objects are not thread-safe [1]. This later allows us to safely interact with the tracking system regardless of what thread we are on. Moreover, we will execute all core data related code inside a performAndWait [2] closure of the managedContext, which blocks the caller until the closure has run, so all our operations are executed synchronously. We need this because many of our actions depend on each other, such as checking the stored events right after storing a new one.

We are going to implement three public methods through which other components interact with this class.

func storeEvent(_ event: InHouseTrackingEvent)
func removeEvents(_ events: [InHouseTrackingEvent])
func storedEvents(withMaximumAmountOf limit: Int?) -> [InHouseTrackingEvent]?

But before that we need to implement some private helper methods the public methods will benefit from.

First, we will need to implement a method to execute a given fetch request, which will perform the given request and return its results.

private func performFetchRequest(_ request: NSFetchRequest<NSFetchRequestResult>) -> [NSManagedObject]? {
    var objects: [NSManagedObject]?

    managedContext.performAndWait {
        do {
            objects = try managedContext.fetch(request) as? [NSManagedObject]
        } catch {
            print("Error!")
        }
    }
    return objects
}

We also need a method to create a fetch request to perform, which will have two parameters:

identifiers: An optional array of identifiers to look for.
limit: An optional integer to set the limit of the fetch request.

private func makeFetchRequest(withIDs identifiers: [String]? = nil,
                              withMaximumAmountOf limit: Int? = nil) -> NSFetchRequest<NSFetchRequestResult> {
    let request = NSFetchRequest<NSFetchRequestResult>(entityName: entityName)
    if let identifiers = identifiers {
        request.predicate = NSPredicate(format: "id IN %@", identifiers)
    }
    if let limit = limit {
        request.fetchLimit = limit
    }
    return request
}

Next, we will implement the coreDataObjects method, which retrieves stored NSManagedObjects by calling both the makeFetchRequest and performFetchRequest methods. It takes two parameters:

identifiers: An optional array of identifiers to look for.
limit: An optional integer to set the limit of the fetch request.

private func coreDataObjects(withIDs identifiers: [String]? = nil,
                             withMaximumAmountOf limit: Int? = nil) -> [NSManagedObject]? {
    let request = makeFetchRequest(withIDs: identifiers,
                                   withMaximumAmountOf: limit)

    return performFetchRequest(request)
}

Another component we are going to need is a way to create InHouseTrackingEvent values from stored managed objects before providing them to upper-level APIs. We are going to create a factory class with a makeEvent method for this, as follows:

final class InHouseTrackingEventFactory {

    typealias Keys = ManagedInHouseTrackingEvent.PropertyKey

    /// Initializes and returns an `InHouseTrackingEvent` from the given NSManagedObject
    /// - Returns: Returns an InHouseTrackingEvent from NSManagedObject or nil if any error occurs
    static func makeEvent(from object: NSManagedObject) -> InHouseTrackingEvent? {
        do {
            guard let propertiesData = object.value(forKey: Keys.properties.rawValue) as? Data,
                let properties = try JSONSerialization.jsonObject(with: propertiesData) as? [String: Any],
                let id = object.value(forKey: Keys.id.rawValue) as? String,
                let name = object.value(forKey: Keys.name.rawValue) as? String else {
                    return nil
            }
            return InHouseTrackingEvent(id: id,
                                        name: name,
                                        properties: properties)
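        } catch {
            // Without a return here the function would not compile,
            // since not every path would produce a value.
            print("Failed to deserialise event properties: \(error)")
            return nil
        }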
    }
}

Now we can add a method to TrackingEventsStorage that converts the given managed objects into InHouseTrackingEvent values:

private func events(from coreDataObjects: [NSManagedObject]) -> [InHouseTrackingEvent]? {
    var events = [InHouseTrackingEvent]()
    managedContext.performAndWait {
        for coreDataObject in coreDataObjects {
            if let event = InHouseTrackingEventFactory.makeEvent(from: coreDataObject) {
                events.append(event)
            }
        }
    }
    return events.isEmpty ? nil : events
}

Finally, we will implement a saveContext method to make sure any changes we made will be persisted in the database:

private func saveContext() {
    managedContext.performAndWait {
        do {
            guard managedContext.hasChanges else {
                return
            }
            try managedContext.save()
        } catch {
            print("Error!")
        }
    }
}

Now we are ready to implement our public methods mentioned before. These methods will allow other entities to interact with our core tracking mechanism.

Let's add a typealias to the TrackingEventsStorage class that we will use for our managed model's property keys:

typealias Keys = ManagedInHouseTrackingEvent.PropertyKey

The first public method we are going to implement is storeEvent, which persists the given InHouseTrackingEvent as an NSManagedObject.

func storeEvent(_ event: InHouseTrackingEvent) {
    guard let eventEntity = eventEntity else {
        return
    }

    managedContext.performAndWait {
        // Serialise the properties first, so that a failure does not leave
        // a half-populated object in the context for saveContext() to persist.
        let propertyData: Data
        do {
            propertyData = try JSONSerialization.data(withJSONObject: event.properties)
        } catch {
            print("Error!")
            return
        }

        let managedEvent = NSManagedObject(entity: eventEntity, insertInto: managedContext)
        managedEvent.setValue(event.id, forKey: Keys.id.rawValue)
        managedEvent.setValue(event.name, forKey: Keys.name.rawValue)
        managedEvent.setValue(propertyData, forKey: Keys.properties.rawValue)
    }

    saveContext()
}

The second one is removeEvents, which accepts an array of InHouseTrackingEvent and removes the corresponding managed model for each event in the array.

func removeEvents(_ events: [InHouseTrackingEvent]) {
    let eventIDs = events.map { $0.id }

    guard let coreDataObjects = coreDataObjects(withIDs: eventIDs) else {
        return
    }

    managedContext.performAndWait {
        coreDataObjects.forEach { self.managedContext.delete($0) }
    }

    saveContext()
}

The last public method is storedEvents, which accepts an optional limit and returns at most that many stored events, converted to InHouseTrackingEvent.

func storedEvents(withMaximumAmountOf limit: Int?) -> [InHouseTrackingEvent]? {
    guard let objects = coreDataObjects(withMaximumAmountOf: limit) else {
        return nil
    }

    return events(from: objects)
}

Event Sender Implementation

We are going to omit the implementation details of the event-sending class for simplicity. InHouseTrackingEventSender has a method that accepts an array of InHouseTrackingEvent and makes a URL request to send them to the backend. It also has a weak delegate property of type TrackingEventSenderDelegate, which is notified once events have been successfully submitted to the backend. As you will notice, errors are not handled explicitly: if something goes wrong, we simply do nothing, the events stay in storage, and the same events are sent again later on.

InHouseTrackingEventSender:

final class InHouseTrackingEventSender {

    weak var delegate: TrackingEventSenderDelegate?

    func sendEvents(_ events: [InHouseTrackingEvent]) {
        // Make sure there are no ongoing requests and make a
        // post request to the backend by including each event in
        // the body of the request.


        // success:
        delegate?.didSendEvents(events)

        // error:
        // Handle error
    }
}

TrackingEventSenderDelegate:

protocol TrackingEventSenderDelegate: AnyObject {
    func didSendEvents(_ events: [InHouseTrackingEvent])
}
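Although the networking details are omitted, the request-building step can be sketched. Everything specific here is an assumption made for illustration: the makeRequest helper, the placeholder endpoint URL, and the payload layout with an "events" wrapper key are hypothetical and do not describe the actual Freeletics backend.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // URLRequest lives here on non-Apple platforms
#endif

struct InHouseTrackingEvent {
    let id: String
    let name: String
    let properties: [String: Any]
}

/// Builds a POST request carrying all events in one JSON body.
/// The "events" wrapper key and the per-event layout are assumptions.
func makeRequest(for events: [InHouseTrackingEvent], endpoint: URL) throws -> URLRequest {
    let payload: [String: Any] = [
        "events": events.map { event -> [String: Any] in
            ["id": event.id, "name": event.name, "properties": event.properties]
        }
    ]
    var request = URLRequest(url: endpoint)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try JSONSerialization.data(withJSONObject: payload)
    return request
}

// Placeholder endpoint; the real backend URL is not given in this article.
let endpoint = URL(string: "https://example.com/tracking/events")!
let sampleEvent = InHouseTrackingEvent(id: "1",
                                       name: "example_event",
                                       properties: ["propertyOne": "1"])
let request = try? makeRequest(for: [sampleEvent], endpoint: endpoint)
```

Sending the whole batch in a single body keeps it to one request per batch, which matches the goal of not triggering a URL request for each stored event.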

Event Batcher Implementation

It is time for us to implement the last part of our tracking service. We need a batching mechanism to make sure our tracking system takes performance, battery, and real-time tracking into account. By providing a batch size, we try to strike a balance between performance and real-time tracking: instead of triggering a URL request for each stored event, we only trigger one once the number of stored events reaches the batch size. The batcher is going to be a singleton and will be used to directly track an event.

Before implementing the batcher singleton, let's write a simple struct which will be responsible for providing the batch size. We could hardcode this value, but providing it via another entity makes this information easier and clearer to maintain, especially if it can be updated via remote configuration.

struct TrackingEventsBatchSizeProvider {
    let defaultBatchSize = 20
}

extension TrackingEventsBatchSizeProvider: TrackingEventsBatchSizeProviding {
    var batchSize: Int {
        // We just return default size for simplicity but we could get some remote config value
        // at this point and provide it as well.
        return defaultBatchSize
    }
}
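The comment above mentions remote configuration; one way that could look is sketched below. The RemoteConfigurableBatchSizeProvider type and its remoteBatchSize closure are hypothetical, standing in for whatever remote-config system an app uses, and the protocol declaration is an assumed shape since it is not shown in this article.

```swift
// Assumed declaration: a single read-only requirement.
protocol TrackingEventsBatchSizeProviding {
    var batchSize: Int { get }
}

/// Variant of the provider that prefers a remotely configured value when one is available.
struct RemoteConfigurableBatchSizeProvider: TrackingEventsBatchSizeProviding {
    let defaultBatchSize = 20

    /// Hypothetical hook into a remote-config system; returns nil when no value has been fetched.
    let remoteBatchSize: () -> Int?

    var batchSize: Int {
        // Ignore missing or nonsensical remote values and fall back to the default.
        guard let remote = remoteBatchSize(), remote > 0 else {
            return defaultBatchSize
        }
        return remote
    }
}

let withoutRemoteValue = RemoteConfigurableBatchSizeProvider(remoteBatchSize: { nil })
let withRemoteValue = RemoteConfigurableBatchSizeProvider(remoteBatchSize: { 50 })
```

Because batchSize is a computed property, a value that changes remotely would be picked up the next time the batcher asks for it, without recreating the provider.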

TrackingEventsBatcher:

Now we can create the batcher singleton, TrackingEventsBatcher.

It will have four properties, three of which are injected through the initialiser:

shouldBatchEvents: A boolean to indicate if events should be batched or sent immediately.
eventStorage: An instance of TrackingEventsStorage.
eventSender: An instance of InHouseTrackingEventSender.
batchSizeProvider: A struct to provide how big the batch size should be.

It will also conform to TrackingEventSenderDelegate to set itself as the delegate of the initialised event sender class.

final class TrackingEventsBatcher: TrackingEventSenderDelegate {

    static let shared = TrackingEventsBatcher()

    var shouldBatchEvents = true

    private var eventStorage: TrackingEventStoring
    private var eventSender: TrackingEventSending
    private var batchSizeProvider: TrackingEventsBatchSizeProviding

    init(eventStorage: TrackingEventStoring = TrackingEventsStorage(),
         eventSender: TrackingEventSending = InHouseTrackingEventSender(),
         batchSizeProvider: TrackingEventsBatchSizeProviding = TrackingEventsBatchSizeProvider()) {
        self.eventStorage = eventStorage
        self.eventSender = eventSender
        self.batchSizeProvider = batchSizeProvider
        self.eventSender.delegate = self
    }

    func didSendEvents(_ events: [InHouseTrackingEvent]) {
        // empty for now
    }
}
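The batcher refers to three protocols, TrackingEventStoring, TrackingEventSending and TrackingEventsBatchSizeProviding, whose declarations are not shown in this article. A plausible minimal shape, inferred only from how the batcher uses them, might look like the sketch below; the exact requirements are assumptions, and SpyEventSender is a hypothetical test double added to show what the abstraction makes possible.

```swift
struct InHouseTrackingEvent {
    let id: String
    let name: String
    let properties: [String: Any]
}

protocol TrackingEventSenderDelegate: AnyObject {
    func didSendEvents(_ events: [InHouseTrackingEvent])
}

// Assumed declarations, mirroring the public methods described in the article.
protocol TrackingEventStoring {
    func storeEvent(_ event: InHouseTrackingEvent)
    func removeEvents(_ events: [InHouseTrackingEvent])
    func storedEvents(withMaximumAmountOf limit: Int?) -> [InHouseTrackingEvent]?
}

protocol TrackingEventSending: AnyObject {
    var delegate: TrackingEventSenderDelegate? { get set }
    func sendEvents(_ events: [InHouseTrackingEvent])
}

protocol TrackingEventsBatchSizeProviding {
    var batchSize: Int { get }
}

/// Hypothetical test double: records batches instead of hitting the network.
final class SpyEventSender: TrackingEventSending {
    weak var delegate: TrackingEventSenderDelegate?
    private(set) var sentBatches = [[InHouseTrackingEvent]]()

    func sendEvents(_ events: [InHouseTrackingEvent]) {
        sentBatches.append(events)
        delegate?.didSendEvents(events) // report success immediately
    }
}

let spy = SpyEventSender()
spy.sendEvents([InHouseTrackingEvent(id: "1", name: "example_event", properties: [:])])
```

With declarations like these, the concrete classes would only need to declare conformance, and the default arguments of the batcher's initialiser could be swapped for doubles such as the spy in unit tests.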

As you can see, shouldBatchEvents is not private, so it can be modified later. This flag lets us either submit tracked events immediately or batch them until we reach the batch size. For the simplicity of this article, it will always be true.

Now we will add two private helper methods: the first determines whether the events should be sent, and the second sends them if needed:

private func shouldSendEvents(_ events: [InHouseTrackingEvent]) -> Bool {
    // Send events immediately if they shouldn't be batched, regardless of their number;
    // otherwise send them only once their number has reached the batch size.
    return !shouldBatchEvents || events.count >= batchSizeProvider.batchSize
}

private func sendEventsIfNeeded() {
    guard let storedEvents = eventStorage.storedEvents(withMaximumAmountOf: batchSizeProvider.batchSize),
        shouldSendEvents(storedEvents) else {
            return
    }
    eventSender.sendEvents(storedEvents)
}

We first fetch stored events with batchSize limit and then see if we should be sending events already.

Now we will implement the method that is the entry point of our whole tracking infrastructure. It will be called throughout the application wherever an entity needs to track an event.

func batchEvent(_ event: InHouseTrackingEvent) {
    eventStorage.storeEvent(event)
    sendEventsIfNeeded()
}

Whenever an event is tracked through batchEvent, we will store it and check if events should be sent.

Finally, we will update the didSendEvents method as follows:

func didSendEvents(_ events: [InHouseTrackingEvent]) {
    eventStorage.removeEvents(events)

    sendEventsIfNeeded()
}

We make sure all submitted events are removed from storage and then check whether more events should be sent. This logic is needed because the number of stored events might have grown to twice the batch size or more, in which case at least one more full batch is waiting. This can occur when the app is used offline and no events have been sent for a while.
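To see why the re-check in didSendEvents matters, the cycle can be modelled without CoreData or networking. The sketch below is a simplified stand-in, not the batcher itself: the drain function is hypothetical, it replaces the delegate callback chain with a loop, uses plain integers instead of events, and assumes every send succeeds.

```swift
/// Simulates the send / remove / re-check cycle on an in-memory backlog.
/// Returns the batches that would have been sent, in order.
func drain(storedEventIDs: inout [Int], batchSize: Int) -> [[Int]] {
    precondition(batchSize > 0, "A non-positive batch size would never make progress")
    var sentBatches = [[Int]]()

    while true {
        // Equivalent of storedEvents(withMaximumAmountOf: batchSize).
        let batch = Array(storedEventIDs.prefix(batchSize))

        // Equivalent of shouldSendEvents with batching enabled.
        guard batch.count >= batchSize else {
            break
        }

        // Equivalent of sendEvents succeeding, followed by removeEvents in didSendEvents.
        sentBatches.append(batch)
        storedEventIDs.removeFirst(batch.count)
    }
    return sentBatches
}

// A backlog of 45 events accumulated while offline, with a batch size of 20.
var stored = Array(1...45)
let batches = drain(storedEventIDs: &stored, batchSize: 20)
```

In this run two full batches are sent back to back and five events remain stored until enough new events arrive to fill another batch.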

Usage

Let's see how we can interact with the system with a sample class:

class SampleEntity {
    func trackSomething() {
        let eventName = "example_event"
        let id = "\(eventName)_\(Date().timeIntervalSince1970)"
        let properties: [String: Any] = [
            "propertyOne": "1",
            "propertyTwo": true
        ]
        let event = InHouseTrackingEvent(id: id,
                                         name: eventName,
                                         properties: properties)
        TrackingEventsBatcher.shared.batchEvent(event)
    }
}

Usually, we have different entities for different events in our applications, and this manual conversion of properties can be avoided by providing a mechanism that converts properties into the required dictionary format through the event entities. But for simplicity, we just add two arbitrary properties and show how the event can be batched here. We could also implement a wrapper function called track, which could handle batching internally as well.
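One possible shape for such a mechanism is sketched below. The TrackableEvent protocol, the WorkoutCompletedEvent type and this particular track signature are all hypothetical. The sketch also swaps in a UUID for the identifier instead of the name-plus-timestamp string used in the sample class, since two events with the same name created at the same instant would otherwise share an id.

```swift
import Foundation

struct InHouseTrackingEvent {
    let id: String
    let name: String
    let properties: [String: Any]
}

/// Hypothetical protocol: each typed event knows its name and how to express itself as properties.
protocol TrackableEvent {
    var name: String { get }
    var properties: [String: Any] { get }
}

extension TrackableEvent {
    /// Converts the typed event into the generic model, using a UUID as the identifier.
    func makeTrackingEvent() -> InHouseTrackingEvent {
        InHouseTrackingEvent(id: UUID().uuidString, name: name, properties: properties)
    }
}

/// Example typed event; the name and fields are made up for illustration.
struct WorkoutCompletedEvent: TrackableEvent {
    let workoutID: String
    let durationInSeconds: Int

    var name: String { "workout_completed" }
    var properties: [String: Any] {
        ["workout_id": workoutID, "duration_in_seconds": durationInSeconds]
    }
}

/// Wrapper that hides the conversion. The `batch` closure stands in for
/// TrackingEventsBatcher.shared.batchEvent so the sketch runs on its own.
func track<Event: TrackableEvent>(_ event: Event, using batch: (InHouseTrackingEvent) -> Void) {
    batch(event.makeTrackingEvent())
}

var tracked = [InHouseTrackingEvent]()
track(WorkoutCompletedEvent(workoutID: "w1", durationInSeconds: 900)) { tracked.append($0) }
```

Call sites then only construct a typed event, and the dictionary keys live in one place next to the event definition.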

Further improvements for TrackingEventsStorage

There are a few more things we need to consider for TrackingEventsStorage, especially for the saveContext() method. UIApplication has a property named isProtectedDataAvailable, which tells us whether protected data is currently accessible; it is false when data protection is active and the device is locked. In such cases we should not attempt database operations, otherwise we might experience crashes [3].

Let’s add a check for this property next to the check for changes (in saveContext):

guard UIApplication.shared.isProtectedDataAvailable,
    managedContext.hasChanges else {
    return
}

One could expect this to work right away, but now we have another problem. We have implemented our tracking mechanism to be thread-safe, but UIApplication.shared.isProtectedDataAvailable should only be read from the main queue. Thus, we need to check which queue we are on before reading this value and synchronise with the main queue if necessary. We could just check Thread.isMainThread, but we are going to go with a different solution instead, since being on the main thread does not guarantee that we are on the main queue [4].

We are going to use a refactored version of a Stack Overflow answer (linked in the code comment below) to properly determine which dispatch queue we are running on.

DispatchQueue extension:

import Foundation

// Reference https://stackoverflow.com/a/60314121/8447312
public extension DispatchQueue {

    static var current: DispatchQueue? { getSpecific(key: key)?.queue }

    private struct QueueReference {
        weak var queue: DispatchQueue?
    }

    private static let key: DispatchSpecificKey<QueueReference> = {
        let key = DispatchSpecificKey<QueueReference>()
        setUpSystemQueuesDetection(key: key)
        return key
    }()

    private static func setUpSystemQueuesDetection(key: DispatchSpecificKey<QueueReference>) {
        let queues: [DispatchQueue] = [
            .main,
            .global(qos: .background),
            .global(qos: .default),
            .global(qos: .unspecified),
            .global(qos: .userInitiated),
            .global(qos: .userInteractive),
            .global(qos: .utility)
        ]
        registerDetection(of: queues, key: key)
    }

    private static func registerDetection(of queues: [DispatchQueue], key: DispatchSpecificKey<QueueReference>) {
        queues.forEach {
            $0.setSpecific(key: key,
                           value: QueueReference(queue: $0))
        }
    }
}
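The extension relies on queue-specific values: a value attached to a queue with setSpecific is only visible to DispatchQueue.getSpecific while code is running in the context of that queue. A minimal stand-alone sketch of that mechanism follows, using a hypothetical serial queue and a plain string value rather than the QueueReference wrapper above:

```swift
import Foundation

// Hypothetical key, queue and value, used only to show how queue-specific data behaves.
let key = DispatchSpecificKey<String>()
let trackingQueue = DispatchQueue(label: "tracking.sketch")
trackingQueue.setSpecific(key: key, value: "tracking")

// Read while running on the queue: the value attached above is visible.
var valueOnQueue: String?
trackingQueue.sync {
    valueOnQueue = DispatchQueue.getSpecific(key: key)
}

// Read from the calling context, where nothing was registered for this key.
let valueOffQueue = DispatchQueue.getSpecific(key: key)
```

The lookup follows the queue a block was submitted to rather than the thread that happens to execute it, which is the property that makes this approach more dependable than a thread check.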

Now we will add a new method to TrackingEventsStorage that reads isProtectedDataAvailable safely:

private func isProtectedDataAvailable() -> Bool {
    var isProtectedDataAvailable = false

    if DispatchQueue.current == DispatchQueue.main {
        isProtectedDataAvailable = UIApplication.shared.isProtectedDataAvailable
    } else {
        DispatchQueue.main.sync {
            isProtectedDataAvailable = UIApplication.shared.isProtectedDataAvailable
        }
    }
    return isProtectedDataAvailable
}

Let’s change the saveContext method to the following, in order to make sure UIApplication.shared.isProtectedDataAvailable is true before saving the context.

private func saveContext() {
    let protectedDataAvailable = isProtectedDataAvailable()
    managedContext.performAndWait {
        do {
            guard protectedDataAvailable,
                managedContext.hasChanges else {
                    return
            }
            try managedContext.save()
        } catch {
            print("Error!")
        }
    }
}

One thing to note is that we switch queues, if necessary, outside of the performAndWait closure. This is needed since perform and performAndWait closures should only be used for work related to NSManagedObjects.

Conclusion

In this article, we have seen how CoreData can be used for a custom event tracking system implementation. We have built a system to persist events temporarily on the device and submit them to the backend in batches. We have also made sure that such a system can be accessed from different threads/queues and explored ways of properly determining the current queue of the execution.