Aug 18, 2017

Friday Facts #204 - Another day, another optimisation

Factorio - Klonan

Hello, as the team is slowly getting bigger and we still don't have a dedicated project manager, we had to start looking for tools to help us manage the team. We are testing software that allows our team members to track time spent on individual tasks, so right now my timer on "Friday facts related work" is running. I hope it will give me better insight into what kinds of tasks our time goes to, where we are losing most of it, and what people were doing when I was not here. People tend not to like these kinds of changes, but we have to admit that we are no longer a 4-person punk development team working from our living room, and we need to invest more time into working efficiently.

Prefetching (Technical)

Kovarex already presented a concise summary of the prefetching patch; here is some more background and the dirty technical details.

I started to look into Factorio performance improvements a while back, more specifically UPS (updates per second) improvements for large bases. It is widely recognized that UPS is mostly limited by memory performance. That is normal - even highly optimized scientific simulation codes are rarely limited by arithmetic instructions.

At first, I looked into ways to reduce the size of Entities. Common entity sizes like Inserter (536 bytes) or AssemblingMachine (648 bytes) seem surprisingly large at first. I tried some changes, e.g. moving less frequently accessed data out of the actual entity into a separate object in memory. These changes had a significant impact on the code in many files, but just saving a few bytes didn't make a measurable impact on performance.

Back to a bit of theory - there are two different ways in which memory can become a bottleneck: bandwidth (the amount of data supplied over time, e.g. 50 GB/s) and latency (the time until a requested piece of data is available, e.g. 50 ns). Comparing the results for different RAM timing settings (CAS latency) shows that latency has a significant impact. It is important to note that Factorio is not a homogeneous workload - some parts are still limited by memory bandwidth, others by the CPU.

Modern CPUs are extremely good at mitigating memory bottlenecks by using caches, speculative execution and prefetchers. However, all active entities are read at every tick of the game. In large factories, this is too much data for the caches. Also, a virtual function call - such as the update of an entity - cannot be executed speculatively. Prefetchers are a part of the CPU that predicts what memory is going to be accessed soon and transfers it even before it is explicitly loaded. But since the entity update loop iterates over a linked list - the address of the next entity is stored within the entity itself - the next access is difficult (though not impossible) to predict.

This is where software prefetching comes in - the programmer gives the CPU a hint about what memory will be accessed soon. That is what we now do in Factorio: before an entity is updated, the next entity is already requested so that it can be loaded in the background. The principle also applies to a few other loops over linked lists. The nice thing about this is that it is an extremely simple and isolated change in the code. The downside is that you are entering the realm of architecture-specific micro-optimization. If you aren't careful, it can even be bad for performance.
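As a rough illustration of the idea (not the actual Factorio code), here is a minimal sketch of an update loop over a linked list that requests the next node before working on the current one. The `Entity` struct, `prefetchForRead` and `updateAll` are hypothetical names made up for this example; `__builtin_prefetch` is the GCC/Clang builtin, and on other compilers the sketch simply does nothing for the hint.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an active entity kept in an intrusive linked list.
struct Entity {
    Entity* next = nullptr;    // the address of the next entity lives inside the entity
    std::uint64_t state = 0;

    void update() { ++state; } // placeholder for the real per-tick logic
};

// Thin wrapper around the GCC/Clang builtin; a no-op on other compilers.
inline void prefetchForRead(const void* address) {
#if defined(__GNUC__) || defined(__clang__)
    __builtin_prefetch(address, 0 /* read access */, 3 /* high temporal locality */);
#else
    (void)address;
#endif
}

// Updates every entity in the list and returns how many were updated.
inline std::size_t updateAll(Entity* head) {
    std::size_t updated = 0;
    for (Entity* current = head; current != nullptr; current = current->next) {
        // Ask for the next entity first, so the load can overlap with the update below.
        if (current->next != nullptr)
            prefetchForRead(current->next);
        current->update();
        ++updated;
    }
    return updated;
}

// Usage example: three chained entities updated for one tick.
inline void exampleTick() {
    Entity third, second, first;
    first.next = &second;
    second.next = &third;
    const std::size_t updated = updateAll(&first);
    assert(updated == 3);
    assert(third.state == 1);
    (void)updated;
}
```

The prefetch is only a hint: the loop computes the same result with or without it, which is why the change can stay so isolated.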

A good rule is to never guess about performance - always verify. So I did some tests with different maps and the results were promising. Entities are larger than a single cache line and the pointers point into the middle of the object due to multiple inheritance. Many experiments later, the optimal range turned out to be -128 bytes to +384 bytes (8 cache lines). This coincides with the sizes of typical entities. The prefetching instruction has another parameter determining the cache level used - which again was determined experimentally.

To get a bit more diversity, the measurements for this chart were done on a different CPU (i7-6700K vs i7-4790K previously), and include some more maps. It showed that the new belt-heavy map got less speedup (+5%) from software prefetching than the others. On the other hand, this map got a huge boost from the earlier belt optimization. Other saves got a nice 9-13% speedup. All measurements are average update times over 3600 ticks; the boxplots show 20 repeated runs.

Overall, software prefetching is a nice, effective micro-optimization that needs very few code changes, but many measurements to find the right configuration and verify it.
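To make the numbers concrete, here is a hedged sketch of prefetching a byte range around an entity pointer. Only the -128 to +384 range comes from the text; the 64-byte cache line size, the function name `prefetchEntityRange` and the locality hint value are assumptions, and the real patch may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::ptrdiff_t kCacheLine = 64;    // assumed cache line size (typical for x86-64)
constexpr std::ptrdiff_t kRangeBegin = -128; // range start from the post
constexpr std::ptrdiff_t kRangeEnd = 384;    // range end from the post (exclusive)

static_assert((kRangeEnd - kRangeBegin) / kCacheLine == 8,
              "the range should cover 8 cache lines");

// Issues one prefetch hint per cache line in [entity - 128, entity + 384)
// and returns the number of hints issued.
inline int prefetchEntityRange(const void* entity) {
    assert(entity != nullptr);
    // Integer arithmetic avoids forming out-of-bounds pointers by pointer arithmetic.
    const auto base = reinterpret_cast<std::uintptr_t>(entity);
    int lines = 0;
    for (std::ptrdiff_t offset = kRangeBegin; offset < kRangeEnd; offset += kCacheLine) {
        const std::uintptr_t address = base + static_cast<std::uintptr_t>(offset);
#if defined(__GNUC__) || defined(__clang__)
        // The last argument is the locality hint; it must be a compile-time constant,
        // so trying another cache level means recompiling and measuring again.
        __builtin_prefetch(reinterpret_cast<const void*>(address), 0, 3);
#else
        (void)address;
#endif
        ++lines;
    }
    return lines;
}
```

The negative start offset reflects that the pointer can point into the middle of the object, so part of the entity lies before the address being followed.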

Crafting machine animation optimisation

The issue is that crafting machines can have an arbitrary number of secondary animations tied to them (rotating fan, liquid in the chemical plants etc.). As each of the animations can have a different speed and frame count, we kept the positions of all of these animations in a dynamically allocated vector and updated each of them independently whenever the crafting machine was producing. Now we just have one number representing the overall offset of the animations. We move it depending on the speed of the crafting machine, and all the animations calculate their cyclic position from the modulo of this value only when we actually need to draw the machine.

This means that this complicated code:

void CraftingMachine::setupWorkingVisualisationFrames(double performance)
{
  const CraftingMachinePrototype& prototype = *this->getPrototype();
  this->frame.move(performance, prototype.animation.getAnimation(this->direction));
  if (this->workingVisualisationFrames.empty())
  {
    this->workingVisualisationFrames.resize(prototype.workingVisualisations.size());
    for (size_t i = 0; i < this->workingVisualisationFrames.size(); ++i)
      this->workingVisualisationFrames[i].randomize(prototype.workingVisualisations[i].getAnimation(this->direction),
                                                    this->getMap().getRandomGenerator());
  }
  for (size_t i = 0; i < this->workingVisualisationFrames.size(); ++i)
    this->workingVisualisationFrames[i].move(prototype.workingVisualisations[i].getAnimation(this->direction));
}

Becomes this simple:

void CraftingMachine::setupWorkingVisualisationFrames(double performance)
{
  this->frameReference += performance;
  this->showWorkingVisualisations = true;
}
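The drawing side is not shown here. As a minimal sketch of how each animation could derive its cyclic frame from the single shared offset at draw time, assuming a hypothetical `AnimationSpec` with a frame count and a speed multiplier (neither name is from the game's code):

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical description of one secondary animation (fan, liquid, ...).
struct AnimationSpec {
    std::uint32_t frameCount; // number of frames in the animation
    double speed;             // frames advanced per unit of the shared offset
};

// Computes which frame to draw from the machine-wide offset.
// Nothing is stored per animation; the position is the offset modulo the frame count.
inline std::uint32_t frameToDraw(double frameReference, const AnimationSpec& spec) {
    assert(spec.frameCount > 0);
    assert(frameReference >= 0.0);
    const double position =
        std::fmod(frameReference * spec.speed, static_cast<double>(spec.frameCount));
    return static_cast<std::uint32_t>(position); // truncates towards zero
}
```

With this shape, the per-tick update only adds to one number, and the modulo is paid only for machines that are actually on screen.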

The memory size of the crafting machine is decreased and the overall performance of the game is improved by an additional 2%.
Another day, another optimisation :)

HR Lab

The weekly dose of updated high resolution graphics:

Related to HR entities, it turned out that our zooming system never showed an exact zoom of 2.0, which would be the 'pixel perfect' zoom level for the HR entities. By changing the zoom rate from 1.1 to the 7th root of 2 (1.104089...), the zoom now increments perfectly from 1.0 to 2.0 in 7 steps.
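The arithmetic can be checked with a few lines of code. This sketch assumes that each zoom step multiplies the current zoom by the rate; the function names are made up for the example.

```cpp
#include <cassert>
#include <cmath>

// Zoom level after a number of multiplicative zoom steps.
inline double zoomAfterSteps(double startZoom, double rate, int steps) {
    assert(steps >= 0);
    return startZoom * std::pow(rate, steps);
}

// The new zoom rate: seven steps of this rate multiply the zoom by exactly 2.
inline double seventhRootOfTwo() {
    return std::pow(2.0, 1.0 / 7.0); // approximately 1.104089
}
```

With the old rate of 1.1, seven steps from 1.0 give about 1.9487 and eight steps give about 2.1436, so 2.0 is skipped; with the 7th root of 2, the seventh step lands on 2.0 up to floating point rounding.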

As always, let us know any thoughts or feedback over on our forum.

Aug 11, 2017

Friday Facts #203 - Logistic buffer chest

Factorio - Klonan

Further optimisations

I finished the item stack optimisations mentioned in FFF-198, and was able to do some performance tests. First I tested how many stacks on a big map actually need to use an externally allocated object (Item), and how many of them are plain. On the huge map I tested, it turned out that only 36K out of 1M stacks need the Item object. These were mainly science packs, as they need it to track how used-up they are (and now that I think about it, it could also be omitted by only using the objects for science packs that are already partially used up). Overall factory performance increased by approximately 2% from this. It is nothing huge, but every bit matters.

One of the programmers who has read access to the code (Zulan) came up with a pull request that improves performance in Factorio by prefetching memory ahead in the update loops.

The problem when normally updating objects is that the CPU asks for the memory representing the object. The memory is slow, at least compared to the CPU cache or the CPU speed. The memory transfer speed itself is not that slow, but the waiting time (latency) between ordering the data and receiving it is. This means that what very often happens is that the CPU orders the data of the next entity from memory, then waits for quite a long time to get it, and only then does its logic. Memory prefetching partially solves this by doing the following:
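As a rough sketch of the "plain stack vs. stack with an Item object" idea, assuming a layout where the external object is allocated only on demand. The field names, the `durability` member and the default value are all hypothetical; the actual data layout in the game is not shown in this post.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical externally allocated per-item data, e.g. how used-up a science pack is.
struct Item {
    double durability = 1.0;
};

// A stack that stays "plain" (no allocation) until it needs per-item data.
struct ItemStack {
    std::uint16_t itemId = 0;
    std::uint32_t count = 0;
    std::unique_ptr<Item> extra; // null for plain stacks

    bool isPlain() const { return extra == nullptr; }

    // Allocates the Item object the first time it is needed.
    Item& ensureItem() {
        assert(count > 0); // an empty stack has nothing to attach data to
        if (!extra)
            extra = std::make_unique<Item>();
        return *extra;
    }

    // Plain stacks report the default value without touching the heap.
    double durability() const { return extra ? extra->durability : 1.0; }
};
```

In the measurement above, 36K of 1M stacks is about 3.6%, so under a layout like this the large majority of stacks would never allocate or dereference the external object.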

Order data of the next entity from memory (prefetch)
Do the logic of the current entity in the meantime
Go back to start

The overall measured performance improvements vary between 6-10%, which is certainly a nice addition.

Logistic buffer chest

As flexible and powerful as the logistic system is, we have always felt there was one key piece missing from the puzzle. The main issue is that requester chests cannot provide their items to any other member of the logistics system. Trying to work around this by putting an inserter into a passive provider just leads to the robots moving the items in a loop. This is also a nuisance when trying to supply construction robots with materials, as they can only collect them from storage or provider chests, and those are typically located only in the main base areas.

It is easy to design a system to resupply far-out areas using trains to directly put items in provider chests, but if they are in the same logistic network we encounter the same loop as before. We were also concerned about people segregating their logistic networks for more control; it seemed to us to be a workaround for a problem we should fix. The solution is the buffer chest, which functions as both a requester and a passive provider chest.

You can see the buffer can act as an 'in-between' for storage/provider chests and the requesters. This leads to a solution of the main annoyances we identified.

Typically when you set up all the provider chests, they are spread out across your whole factory. When you return to base for a resupply, you end up waiting for a long time while the bots travel from all over the factory with items. By using a buffer chest, you can set up a dedicated 'supply area', where the buffer chest will already contain all the typical items, and the bots can quickly top up your inventory.

Another problem is when you have a large perimeter defence, and you want it to be maintained by the construction robots. When the main base is so far away, it can take a long time for robots to arrive with repair packs, so the biters might be able to break through. Using the buffer chest, it will be easy to set up nearby supplies to quickly repair the walls when needed.

The last, but not least, use case is when you want to dedicate part of your storage to specific things. The reason can be either just OCD, or the fact that you can make sure that too much coal, for example, won't make it impossible to store enough iron ore in your storage system.

High resolution robots

Here we present the regular dose of new entities updated for high resolution, for the hopefully fully high-res friendly 0.16 release.

As always, leave us any feedback or comments on our forum

Aug 4, 2017

Friday Facts #202 - High res circuit connectors

Factorio - Klonan

I decided to write about the results of the item stack optimisations explained in FFF-198, so I rushed today to finish the implementation, just to find out that the task affects an even bigger part of the code than I expected. Items are related to many things in Factorio :)

After many hours of rewriting and fixing, I can compile it and even start a game, but most things are broken. It is quite funny to see some of the basic item interactions broken. Now I'm making commits like "Now I can split stacks", "Now I can merge stacks", etc. It reminds me of the old days. In conclusion, the details of the optimization will have to wait for next week, and since it is after 10pm, this Friday Facts will be somewhat shorter :)

High res and improved circuit connectors

I can at least present the continued work on the updated high resolution graphics. The update of the circuit connectors not only provides them in high resolution, but since they can now be seen in more detail, the graphics can show more accurately what the connector is doing: specifically, whether it is only reading the state of the machine (blue LED), controlling its behaviour (red/green LED), or both.

You can also clearly see how weird it looks when we combine a low res entity (roboport, chests, liquid tank) with a high res connector, but this is just a temporary state and most of these entities should be high-res compatible when the release is ready.

You can also notice (on the chests for example) that the green and red connectors are not vertically aligned, which might look slightly weird, but it is on purpose. In the current version (0.15), when two entities are in one column and they use both the green and the red cable, the cables overlap perfectly, so it is impossible to see both of them at the same time, as shown below.

Additionally, the green LED in this example doesn't make any sense, as the chest will always have only read mode, which is also addressed in the new graphics.

As always, leave us any feedback or comments over on our forum

Aug 2, 2017

Factorio 0.15.32 released

Factorio - posila87

Bugfixes

Fixed a compatibility problem with several antivirus programs.
Fixed that the seed in map-gen-settings.json would be ignored when creating a map on a headless server.
Fixed that connecting to a multiplayer game with a large blueprint library might be difficult.
Fixed that using capsules would open an Entity's GUI when clicked.
Fixed that --window-size=maximized wouldn't work on Linux.
Fixed that changing reactor consumption (production) values through a mod didn't update its production until rebuilt.
Fixed that blueprints would sometimes stop transferring.
Fixed a crash when opening an item/container while the controller is set to one that doesn't have an inventory.
Fixed 3 possible crashes related to receiving a malformed packet over the network.
Maybe fixed a biter path cache-related crash.
Fixed that bad_alloc and similar low-level errors were caught internally, so we couldn't get a proper stack trace for them.
Limited the size of a train chart tag when the map is zoomed in.
Possible fix for a rare crash related to building rails and viewing a preview of entities right after that.
Limited the technology cost multiplier to a maximum of 1000.

Scripting

The log method now also specifies the mod that wrote the message, not only the script file.
Added LuaEntityPrototype::distribution_effectivity read.
Added LuaEntityPrototype::time_to_live read.
Added LuaControl::following_robots read.
Added LuaPlayer::pipette_entity().
Added LuaEntity::can_be_destroyed().
Added script_raised_destroy reserved event ID.
Added script_raised_built reserved event ID.
Added script_raised_revive reserved event ID.
Changed LuaEntity::time_to_live to also work for combat robots.
Changed LuaEntityPrototype::fluid_capacity read to also work on fluid-wagon.
Changed LuaEntityPrototype::turret_range read returns nil instead of error if not turret.
Changed LuaEntity::train to return nil if entity is not rolling stock.
Added LuaEntityPrototype::explosion_beam read.
Added LuaEntityPrototype::explosion_rotate read.

You can get experimental releases by selecting the 'experimental' beta branch under Factorio's properties in Steam.
