Programming is Art
Do you enjoy traditional fine arts – music, painting or sculpture? The art of programming is different.
It is alive, interactive and responsive. You can get immediate or delayed feedback and adjust accordingly.
My older brother Henrik introduced me to programming when I was 13 years old. I have loved it ever since, and being a part of Dassault Systèmes presents me with many exciting opportunities for creative problem solving – the art of programming.
I have a background in telecommunications, so optimizing network interfaces is naturally one of my passions.
Having effective network interfaces can really boost the performance of your applications.
Almost every modern application communicates over network interfaces.
Web applications load HTML, image files and other resources from the Web server.
Desktop applications often communicate with back-end servers, for example to check for upgrades.
📉 Bad performance?
In order to identify what is causing the performance issues on your network interface, you need to focus on the limiting factors.
Here are a few of them:
- Bandwidth – How many bytes can you transfer per unit of time? It is measured uplink, from the client to the server, and downlink, the other way.
- Latency – How long does it take for a signal to travel from the client to the server, and back?
- DNS lookup time – Before the client application can transmit anything to a server, it needs to translate the server name to an IP address using the DNS service. The time it takes to perform this lookup, together with the capacity and performance of the local cache, is an important factor.
- Routing – The route the messages take through the network can also be a factor, because each hop along the route adds delay.
Some of these factors must be improved at the system level. You can also improve the overall performance and the user experience at the application level.
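To see how bandwidth and latency interact, here is a minimal back-of-the-envelope sketch. The function name, the simple additive model and the link figures (1 MB/s, 100 ms round trip) are assumptions for illustration only; real links also suffer from congestion, protocol overhead and DNS and routing delays.

```python
def estimate_transfer_seconds(payload_bytes, bandwidth_bytes_per_s,
                              round_trip_s, round_trips=1):
    """Rough model: time spent waiting on round trips plus time pushing bytes."""
    return round_trips * round_trip_s + payload_bytes / bandwidth_bytes_per_s


# Hypothetical link: 1 MB/s of bandwidth and a 100 ms round-trip time.
BANDWIDTH = 1_000_000
ROUND_TRIP = 0.1

# 10 KB in total, but fetched through 20 sequential requests.
small_chatty = estimate_transfer_seconds(10_000, BANDWIDTH, ROUND_TRIP, round_trips=20)
# 1 MB fetched in a single request.
large_single = estimate_transfer_seconds(1_000_000, BANDWIDTH, ROUND_TRIP, round_trips=1)

print(f"20 small requests: {small_chatty:.2f} s")
print(f"1 large request:   {large_single:.2f} s")
```

Under these assumed numbers the chatty exchange is dominated by latency, even though it moves a hundred times less data.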
Can we fix it?
First of all, when it comes to network interface design, one size doesn’t fit all.
There are standards and standard frameworks you can utilize.
For example, Microsoft's Windows Communication Foundation – WCF.
WCF can serialize .NET objects and push them over the wire to the server without much coding on your part. It can automatically serialize and de-serialize .NET object trees. It uses attributes at the class level and on properties as well as reflection to inspect and convert the objects into bytes.
This is expensive in terms of memory and processing power.
Therefore, it can automatically generate assemblies with code to perform the serialization, so the inspection only has to happen the first time when a given object needs to be serialized.
You can also generate the serialization assemblies as part of your build process, so you don’t have to pay the cost at runtime.
This may be a great option for you if
- the client and the server are built on .NET
- the messages are small in size
- the load is low
However, the convenience comes at a price.
The framework is general and has to be able to accommodate any application. It is not going to perform well in all situations.
In project X, we have taken a hybrid approach:
- WCF is used in general
- Specialized serialization and communication are used where performance is critical
BER to the rescue
Keep the amount of data as small as possible. The counterargument is that the protocol needs to be human readable. This is a fallacy: you can use a tool to decipher the messages.
Do not make the protocol complicated either, because a complicated protocol requires extensive processing or memory to generate and decode.
Keep the protocol close in form to the code that uses it.
Formats like XML and JSON are popular because they can be used to serialize any object model. They incur overhead because they are somewhat human readable.
If the sender and the recipient already share the same protocol definition, the following elements are unnecessary:
- JSON – property names
- XML – attribute names and element names
This information will just add to the number of bytes that need to be transmitted.
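A minimal sketch of that overhead, using only the Python standard library. The record and its field names are hypothetical, and the fixed binary layout stands in for any format that both sides have agreed on in advance.

```python
import json
import struct

# Hypothetical record: a row id, a quantity and a price.
row = {"row_id": 1042, "quantity": 7, "price": 19.99}

# Self-describing: the property names travel with every message.
as_json = json.dumps(row).encode("utf-8")

# Agreed layout, network byte order, no padding:
# unsigned 32-bit int, unsigned 16-bit int, 64-bit float.
ROW_FORMAT = "!IHd"
as_binary = struct.pack(ROW_FORMAT, row["row_id"], row["quantity"], row["price"])

print(len(as_json), "bytes as JSON")
print(len(as_binary), "bytes as packed binary")

# The receiver knows the layout, so it can recover the values without names.
row_id, quantity, price = struct.unpack(ROW_FORMAT, as_binary)
```

The trade-off is that the binary form cannot be read without the layout, which is exactly where a decoding tool comes in.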
A better option to reduce the number of bytes sent is to use the Basic Encoding Rules – BER.
BER is a specific encoding of a protocol specified using Abstract Syntax Notation One – ASN.1.
There are standard libraries you can use to generate code that encodes and decodes in BER from an ASN.1 file.
Lightweight Directory Access Protocol – LDAP is an example of a protocol that uses BER.
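To give a feel for what BER looks like on the wire, here is a small hand-written sketch of its tag-length-value layout for three universal types. It is an illustration only: in practice you would generate the encoder from an ASN.1 file, and the helper names and the sample record below are assumptions, not part of any library.

```python
def ber_length(n):
    """Encode a length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    size = (n.bit_length() + 7) // 8
    return bytes([0x80 | size]) + n.to_bytes(size, "big")


def ber_tlv(tag, content):
    """Tag byte, then length, then the content bytes."""
    return bytes([tag]) + ber_length(len(content)) + content


def ber_integer(value):
    # Two's complement, big-endian, in the fewest bytes that keep the sign.
    magnitude = value if value >= 0 else ~value
    size = magnitude.bit_length() // 8 + 1
    return ber_tlv(0x02, value.to_bytes(size, "big", signed=True))


def ber_utf8_string(text):
    return ber_tlv(0x0C, text.encode("utf-8"))


def ber_sequence(*items):
    return ber_tlv(0x30, b"".join(items))


# Hypothetical record: SEQUENCE { id INTEGER, name UTF8String }
encoded = ber_sequence(ber_integer(42), ber_utf8_string("Ada"))
print(encoded.hex(" "))  # 30 08 02 01 2a 0c 03 41 64 61
```

The whole record takes ten bytes, and no field names are transmitted because both sides know the ASN.1 definition.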
To compress or not to compress?
If your application is sending a large amount of information between client and server, it is tempting to use some kind of compression to reduce the number of bytes transmitted.
What you must consider in this situation is that you will incur memory and processing overhead by compressing your messages. Some content, like JPEG and DOCX files, is already compressed, so it is not a good idea to blindly compress everything that gets sent, for instance by using a filter at the framework level.
Small messages especially will not benefit from compression. In some cases the compressed message will get even bigger than the uncompressed one.
For example, if I compress a file with 1 byte in it, the resulting ZIP file will have about 160 bytes.
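One way to avoid blind compression is to decide per message. The sketch below uses Python's zlib; the 256-byte threshold and the list of content types are assumptions that you would tune by measuring your own traffic.

```python
import zlib

# Content that is already compressed (illustrative, not exhaustive).
ALREADY_COMPRESSED = {
    "image/jpeg",
    "application/zip",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
}
MIN_SIZE = 256  # assumed threshold in bytes


def maybe_compress(payload, content_type):
    """Return (body, was_compressed), compressing only when it pays off."""
    if content_type in ALREADY_COMPRESSED or len(payload) < MIN_SIZE:
        return payload, False
    compressed = zlib.compress(payload)
    if len(compressed) >= len(payload):
        return payload, False  # compression made it bigger, send as-is
    return compressed, True


tiny = b"x"
print(len(tiny), "byte grows to", len(zlib.compress(tiny)), "bytes")

repetitive = b"abc" * 1000
body, was_compressed = maybe_compress(repetitive, "text/plain")
print(len(repetitive), "bytes shrink to", len(body), "bytes")
```

The receiver needs to know whether the body was compressed, so the flag would have to travel with the message, for example in a header field.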
✋ How much data is too much?
Transferring data in a block that is too big can make the user experience worse. The user will not be able to proceed until the entire block of data is received. It is better to send data as needed.
As an example, consider an application that is used to edit a table of information.
In the extreme case, the entire table is transferred before it is presented to the user for editing. Once the user is done editing, the entire table is transferred back to the server. Perhaps the table contains thousands of entries, but the user interface only allows for displaying 25 rows at a time.
🎯 Better Strategy
A better strategy in this context would be to transfer only the number of rows that can be displayed to the user. As the user scrolls down, additional rows can be requested from the server automatically, perhaps one page at a time. The application can also detect whether the user is scrolling very quickly. In this case, it could make more sense to jump ahead and transfer a page further down, and skip all the pages in between.
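The paging idea can be sketched in a few lines. Here `fetch_page` is a stand-in for a real server call, and the function names and the in-memory table are assumptions for illustration.

```python
PAGE_SIZE = 25  # rows the user interface can display at once


def fetch_page(table, page, page_size=PAGE_SIZE):
    """Stand-in for a server call: return only the rows of one page."""
    first_row = page * page_size
    return table[first_row:first_row + page_size]


def pages_to_request(current_page, target_page, fast_scroll):
    """Decide which pages to fetch when the user scrolls to target_page."""
    if fast_scroll:
        # Jump ahead and skip everything in between.
        return [target_page]
    step = 1 if target_page >= current_page else -1
    return list(range(current_page + step, target_page + step, step))


table = [f"row {i}" for i in range(5000)]

first_screen = fetch_page(table, 0)
print(len(first_screen), "rows transferred instead of", len(table))

print(pages_to_request(0, 3, fast_scroll=False))  # [1, 2, 3]
print(pages_to_request(0, 40, fast_scroll=True))  # [40]
```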
As the user edits the table, only the modified row is transferred back to the server. In general, it helps to reduce the amount of data sent if you only transmit information about what the user did, and only for the smallest amount of data that can be identified as being related to the action. It can be as specific as an individual keystroke. As an example, the user marks three characters in a text field and presses the delete key. This action can be communicated to the server by identifying the field, the starting position, and the length of the selection. In most cases this will be more effective than re-transmitting the content of the field. You may want your application to make that decision: if the field is small, re-transmit its content, and if it is large, send only the action.
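The delete example could look roughly like this. The message shapes, the field name and the 32-character cut-off are all assumptions made for the sketch.

```python
SMALL_FIELD = 32  # assumed cut-off in characters


def message_for_delete(field_id, text, start, length):
    """Client side: pick the cheaper way to report a deleted selection."""
    if len(text) <= SMALL_FIELD:
        # Small field: simply re-transmit the new content.
        return ("replace", field_id, text[:start] + text[start + length:])
    # Large field: describe the action instead of sending the content.
    return ("delete", field_id, start, length)


def apply_message(fields, message):
    """Server side: apply either kind of message to its copy of the fields."""
    kind, field_id = message[0], message[1]
    if kind == "replace":
        fields[field_id] = message[2]
    elif kind == "delete":
        start, length = message[2], message[3]
        text = fields[field_id]
        fields[field_id] = text[:start] + text[start + length:]


long_text = "abcdefgh" * 100
server_fields = {"notes": long_text}

# The user marks three characters starting at position 2 and presses delete.
message = message_for_delete("notes", long_text, start=2, length=3)
apply_message(server_fields, message)
print(message)
```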
You can also have your application predict what the likely next user action will be, and then fetch that information in the background before the user performs the action. This prediction can be based on the individual user’s previous behavior when using the application.
🧵 String or Batch?
When many messages are sent in quick succession, it can be a symptom that your protocol is too chatty. The round-trip time of the network connection slows things down if you need to wait for the response to each message before proceeding.
An example of this could be that your protocol has a message for deleting a row in the table. The user marks several rows and presses the delete key. In this case, it would be better to send all the messages batched together into one message. It would be even better if the message simply indicates the starting row index and the number of rows to delete.
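Collapsing a selection into ranges is straightforward. This sketch assumes the selected rows are identified by integer indices, and the function name is hypothetical.

```python
def coalesce_rows(row_indices):
    """Turn selected row indices into (start, count) ranges."""
    ranges = []
    for row in sorted(set(row_indices)):
        if ranges and row == ranges[-1][0] + ranges[-1][1]:
            # The row directly follows the last range, so extend it.
            start, count = ranges[-1]
            ranges[-1] = (start, count + 1)
        else:
            ranges.append((row, 1))
    return ranges


selected = [4, 5, 6, 10]
print(coalesce_rows(selected))  # [(4, 3), (10, 1)]
```

Four delete messages become one message with two ranges, and a contiguous selection of any size becomes a single start index and count.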
When deciding whether to send individual messages one after another or to batch them together, the important factor is how closely the messages are tied to an actual user action. The objective is always to make data available to the user as quickly as possible, and only what is needed at that time.
💡 I know!
All communication is more efficient when everyone involved knows the context. This is true in computer network communication as well as communication between people. We know that two friends can be so attuned to each other’s thoughts and ideas that they can even finish each other’s sentences. That is because they share the context.
We can use this principle to optimize network protocols as well. Don’t transmit something that is already known.
We see this utilized in the HTTP protocol, when we use the If-None-Match header. The client specifies the ETag header value previously received for the resource, and the server will then either return an updated resource with a 200 status code, or a status code of 304, if the resource has not changed.
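The server side of that exchange can be sketched as a plain function. Deriving the ETag from a hash of the content is an assumption here; servers are free to generate the value in other ways, and the function names are hypothetical.

```python
import hashlib


def make_etag(body):
    """Assumed scheme: a quoted, shortened hash of the content."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def handle_get(body, if_none_match=None):
    """Return (status, headers, body) for a GET of one resource."""
    etag = make_etag(body)
    if if_none_match == etag:
        # The client already has this version: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


resource = b"<html>version 1</html>"

# First request: the client has nothing cached.
status, headers, payload = handle_get(resource)
cached_etag = headers["ETag"]

# Second request: the client sends the ETag it received earlier.
status_again, _, payload_again = handle_get(resource, if_none_match=cached_etag)
print(status, status_again)  # 200 304
```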
You can implement something similar in your own protocols. You can also preload information to the server or the client, and then later refer to it, rather than sending it at that time. An example of this could again be editing of a table. If each user action is transmitted with a common transaction identifier, we can commit or discard the actions with a message that refers to the transaction.
Can you keep up?
I hope this has inspired you to experiment. Modify your protocols – make them more performant. Provide your users with a better experience!
And consider joining us at BIOVIA – we need people with a passion for software optimization and a creative IT streak.