Think

Handle your big data forms

How to address data integrity and performance and many more in .Net

Autor

Marie Putri Wulandari
Software Architect
bei SYZYGY Techsolutions

Lesedauer
9 Minuten

Publiziert
09. Februar 2024

Handling a form with big data is one problem in web development. However, the solution is not always trivial, especially when it already impacts the performance not only in server-side (Backend), but also in client-side (Frontend). Nobody wants to use a system with slow reaction in GUI, or high risk of losing data during working/saving process. So, how can we solve this problem in our project?

This article focuses more on the implementation of the solution in Backend. The Frontend implementation is described more in the following article: Praxisbericht: Große Formulare

Problem

Reading and writing big data in a GUI form is a common topic in software development area and there are already numerous solution variants available. The main challenges usually focus on addressing the following software qualities:

  • Data integrity: system should assure the data accuracy and consistency over its life cycle
  • Performance: system should provide a certain acceptable level of responsiveness under an expected range of workload
  • Usability: system should provide functionalities and/or user interfaces that are easy and understandable for user to use

 

With a classic approach, which is to show all fields in one form and then to save all at once, these software qualities usually cannot be fulfilled, especially in a distributed system.

  • Data integrity can be violated, when the data is being edited by two or many users at once. Usually, the latest user who commits the changes wins, and the changes from the earlier user(s) are ignored without their acknowledgement
  • Performance is difficult to assure, because sending big data to server and saving to persistent data storage, e.g. DBMS is always limited to the bandwidth and processing capacity
  • Usability is difficult to achieve, when user must work with a big form and monitor all the changes. Moreover, it gives burden to the Frontend to maintain the form, which affects performance in the client-side as well

 

As a sample in one of our projects, we faced the following challenge: We had to provide users the functionality to edit data entity that had many fields (we named it Order, with many OrderPositions which had many Fields). User could change the value of OrderPositions one-by-one in the form, however the saving function should apply to the Order and all its OrderPositions (all-or-nothing). System also had to ensure that only one user could work on one Order at the same time (exclusive writes).

The classic approach was obviously not the optimal solution. It might be still acceptable if the Order had up to 50 OrderPositions. But the business process demanded that the Order could had up to 500 OrderPositions, which would affect the performance significantly.

On server-side, data integrity was a challenge because we had to handle the situation when multiple users want to change the same data at the same time. Meanwhile, blocking original data for other users to maintain consistency was also not an option. Additionally, the system had to provide filtering and paging functionality, so that user could focus on only up to 10 OrderPositions at one time in the page.

With these functional and non-functional requirements to address, we found the following solution.

Solution

The main idea is to create a temporary Order (a snapshot from the original Order) during the writing (editing) life cycle in Backend. As user makes changes in the form, for example the OrderPosition, Frontend sends the changes of each OrderPosition directly to Backend, and Backend saves the changes to the temporary Order. When user triggers the saving function, the Backend applies the changes in temporary data to the original Order in persistent data storage.

This solution has following impacts:

  • Order data in persistent data storage is always consistent, because system either saves or rolls-back the changes completely
  • System can ensure that only one user can work on the same Order at the same time, by validating the owner of temporary Order data
  • Because there can be only one user who can work on the same Order at the same time, the Order is blocked to write by other users until the current writing process finishes. There should be a defined time span on how long the blocking should last until we can assume that current writing process is cancelled
  • Original Order data is not blocked for reading during the writing life cycle, except when Backend applies the changes to persistent data storage
  • During the saving function, Frontend only needs to send the id of the Order, because the data changes are already in the temporary Order
  • Frontend has more flexibility on showing a set of OrderPositions in the form, e.g., by filtering and paging, instead of always showing all OrderPositions

Data

The structure of data for the writing process is shown in the following diagram.

Entities

The original data of the Order contains:

  • OrderId: the id of the Order
  • EditableField_x: editable fields from the Order (user can change its value during the writing process)
  • ReadonlyField_x: readonly fields from the Order (field that will not be changed during the writing process)
  • At least one to many OrderPositions. Each OrderPosition contains:
    • OrderId: foreign key to the Order
    • OrderPositionId: id of the OrderPosition
    • PositionEditableField: editable fields from the OrderPosition
    • PositionReadonlyField: readonly fields from the OrderPosition
public class Order
{
    public int OrderId { get; set; }
    public object EditableField_1 { get; set; }
    public object EditableField_n { get; set; }
    public object ReadonlyField_1 { get; set; }
    public object ReadonlyField_n { get; set; }
 
    public IList<OrderPosition> OrderPositions { get; set; }
}
 
public class OrderPosition
{
    public int OrderPositionId { get; set; }
    public object Position_EditableField_1 { get; set; }
    public object Position_EditableField_n { get; set; }
    public object Position_ReadonlyField_1 { get; set; }
    public object Position_ReadonlyField_n { get; set; }
}

Basically, the temporary data (OrderTemp) is a snapshot from the original data. The readonly fields are excluded for efficiency, so that the temporary data is slimmer and reducing the cost of saving the temporary data.

There are some additional fields in OrderTemp, that are necessary for the writing process:

  • TempKey: a unique key for the writing process of the Order, for example using Guid
  • TempValidUntil: the time when the temporary data should be expired
  • TempOwner: the user who currently edits the Order. In this case, it is the id (integer) of the user
public class OrderTemp
{
    public int OrderId { get; set; }
    public object EditableField_1 { get; set; }
    public object EditableField_n { get; set; }
    public Guid TempKey { get; set; }
    public DateTime TempValidUntil { get; set; }
    public int TempOwner { get; set; }
 
    public IList<OrderPositionTemp> OrderPositionTemps { get; set; }
}
 
public class OrderPositionTemp
{
    public int OrderPositionId { get; set; }
    public object Position_EditableField_1 { get; set; }
    public object Position_EditableField_n { get; set; }
}

Original data is naturally stored in persistent storage. Temporary data can be stored, either in persistent storage, or in temporary storage, e.g. cache or in-memory DB.

Workflow

The workflow of writing process is shown in the following diagram.

Workflow

  • Step 1 – Check OrderTemp, to check if user has an active/valid OrderTemp to edit the Order
  • Step 2 – Create New OrderTemp, to create a new OrderTemp for the Order with a generated TempKey
  • Step 3 – Edit OrderTemp, to allow user to edit OrderTemp in Frontend, by providing filtering and paging features
  • Step 4 – Update OrderTemp, to send the changes to Backend and apply it to OrderTemp
  • Step 5 – Commit Changes, to apply and save the changes in OrderTemp to the Order in the persistent storage

Step 1 – Check OrderTemp

The purpose of this step is to ensure that only one user can edit an Order at the same time. User should provide a valid TempKey, and additionally the OrderId. System should validate if with the given TempKey and OrderId editing is allowed.

This step is implemented as a function IsOrderTempValid, which takes TempKey, OrderId and UserId as input parameters, and returns true if validation succeeds. The validation includes:

  • OrderTemp with the given TempKey exists in the storage
  • OrderTemp has the given orderId
  • OrderTemp is not expired
  • OrderTemp is owned by the user

public bool IsOrderTempValid(Guid tempKey, int orderId, int userId)
{
    // read from persistent storage
    var orderTemp = GetOrderTemp(tempKey);
 
    if (orderTemp == null)
    {
        return false;
    }

    if (orderTemp.orderId != orderId)
    {
        return false;
    }
 
    if (orderTemp.TempValidUntil < DateTime.Now)
    {
        return false;
    }
 
    if (orderTemp.TempOwner != userId)
    {
        return false;
    }
 
    return true;
}

Step 2 – Create New OrderTemp

This step implements a process to create a temporary snapshot of Order (OrderTemp), based on the OrderId given by the user. OrderTemp should have a unique generated key and a fresh snapshotted Order data, taken from persistent storage. The life span of OrderTemp (TempValidUntil) should also be set with a constant life span definition, for example in minute.

This step is implemented as a function CreateOrderTemp, which takes OrderId and UserId as input parameters, and returns OrderTemp as an output. OrderTemp then can be saved, either in persistent or temporary data storage.

public const int TempValidInMinutes = 10; 

public OrderTemp CreateOrderTemp(int orderId, int userId)
{
    // read from persistent storage
    var originalOrder = GetOriginalOrder(orderId);
 
    return new OrderTemp
    {
        OrderId = originalOrder.OrderId,
        EditableField_1 = originalOrder.EditableField_1,
        EditableField_n = originalOrder.EditableField_n,
        TempKey = Guid.New(),
        TempValidUntil = DateTime.Now.AddMinute(TempValidInMinutes),
        TempOwner = userId,
        OrderPositionTemps = 
            originalOrder.OrderPositions
                .Select(p => new OrderPositionTemp
                {
                    OrderPositionId = p.OrderPositionId,
                    Position_EditableField_1 = p.Position_EditableField_1,
                    Position_EditableField_n = p.Position_EditableField_n
                })
                .ToList()
    }
}

Step 3 – Edit OrderTemp

This step extends OrderTemp with relevant features for user to work on in frontend, including the ability to filter and paging the OrderPositions. Backend should send back the requested OrderPositions from the OrderTemp.

This step is implemented as a function GetEditingOrderPositionTemps, which takes FilterParameter and PagingParameter as input parameters, along with TempKey, OrderId and UserId. Only when the OrderTemp is still valid, then the System returns the OrderPositionTemps, based on the given Filter and Paging. Frontend then generates the form with only OrderPositionTemps from this function, instead of all OrderPositionTemps in OrderTemp.

public IList<OrderPositionTemp> GetEditingOrderPositionTemps(
    Guid tempKey, 
    int orderId, 
    int userId, 
    FilterParameter filter, 
    PagingParameter paging
)
{
    if (!IsOrderTempValid(tempKey, orderId, userId))
    {
        throw Exception();
    }
 
    // read from persistent storage
    var orderTemp = GetOrderTemp(tempKey);
    orderTemp.ExtendValidity(); 
 
    var orderPositions = orderTemp.OrderPositionTemps;    
    orderPositions = ApplyFilter(orderPositions, filter);
    orderPositions = ApplyPaging(orderPositions, paging);
 
    return orderPositions;
}

Step 4 – Update OrderTemp

This step implements the functionality to save the changes from Frontend to OrderTemp in Backend. Frontend should always send TempKey, OrderId and UserId, along with the changes of OrderPositionTemp when updating OrderTemp. Only when the OrderTemp is still valid, the System saves the OrderTemp containing the changes in data storage.

public void UpdateOrderTemp( 
    Guid tempKey,  
    int orderId,  
    int userId,  
    OrderPositionChanges changes 
) 
{ 
    if (!IsOrderTempValid(tempKey, orderId, userId)) 
    { 
        throw Exception(); 
    } 
 
    // read from persistent storage 
    var orderTemp = GetOrderTemp(tempKey); 
 
    orderTemp.Edit(changes); 
 
    orderTemp.ExtendValidity(); 
 
    Save(orderTemp); 
}

Step 5 – Commit Changes

This step is executed when user finally triggers the saving function to apply all changes. Frontend only needs to send the TempKey and OrderId to Backend, instead of the complete data changes. Only when the OrderTemp is still valid, the System applies the changes from OrderTemp to the original Order data and saves all in one transaction to the persistent data storage.

public void SaveOrder(Guid tempKey, int orderId, int userId)
{
    if (!IsOrderTempValid(tempKey, orderId, userId))
    {
        throw Exception();
    }
 
    // read from persistent storage
    var orderTemp = GetOrderTemp(tempKey);
    var originalOrder = GetOriginalOrder(orderId);
 
    originalOrder.ApplyChanges(orderTemp);
 
    SaveToPersistent(originalOrder);
}

OrderTemp – Extend Validity

It is not practical to set a certain life span (for example 10 minutes) for the writing process, because it is very likely a user needs longer time to work with big data. On the other hand, to increase the life span (for example 1 hour) has also limitation because there are many scenarios where the writing process ends unintentionally, for example when the user browser crashes, or user just leaves browser open without any more interaction. In this case, the Order would be blocked for other users to write until the OrderTemp is not valid anymore.

As an optimal solution, we allow the validity of the OrderTemp to be extended, everytime a user makes an interaction, for example when user gets (by filtering/paging) OrderPositionTemps (Step 3), or when user updates an OrderPositionTemp (Step 4).

public const int TempValidInMinutes = 10;

public class OrderTemp
{
    public void ExtendValidity()
    {
        TempValidUntil = DateTime.Now.AddMinute(TempValidInMinutes);

        Save();
    }
}

Conclusion

The solution was designed to address specific requirements in the project, such as having only one user to edit data at one time. However, the general idea of having a temporary data can be applied to other use cases.

The main disadvantage from this solution is to maintain temporary data in the storage. However, it is not considered as a critical drawback, because usually storage capacity is not the main restriction. Storing only editable fields in the temporary data, instead of the complete data is one way to mitigate high storage cost. Additionally, we can also provide additional feature to delete expired temporary data regularly, so that the storage saves only the relevant (valid) data.

The implementation of the solution has been released more than a year ago with super performance and no significant bugs. Users are happy, mission accomplished 😊.

Interessiert?
Wir freuen uns über Dein Feedback!
Michael Wolf
Head of Technology
On this page