The guguniverse

Sebastián Uribe

A short history of the guguniverse

What would happen if all our knowledge of the world was stored in a single place, and then it became sentient and emancipated itself from us? What if it were there for you anywhere, anytime you needed it, but it also recorded everything you did? How would people behave if they knew that their every word and gesture was being made public? Or if part of their personal, family, or national history had to be erased to make space for someone else's? The guguniverse is a work of speculative fiction exploring the problem of "peak storage" - the point at which we run out of space to store new information - and the social, political and technological consequences of trying to solve it.

About 12 years ago, I realized that not only are we creating huge amounts of information every day, but that this amount increases exponentially, roughly doubling every two years. This shouldn't surprise anyone familiar with technology, where similar growth trends have been observed in things like transistor density or the size of software. And it makes sense: having outgrown the supporting capacity of our planet, our globalized economy looks for new frontiers to colonize in the digital realm. The problem is that this digital economy does not exist in a vacuum; every bit of information needs to be stored somewhere. Eventually, we will run out of materials to build storage, and out of storage to put our bits in. What will happen then?

A quick calculation showed me that this could happen some 200 years from now. The world, then, should start preparing for it much earlier, during the lifetime of those being born now. Our physical growth will have to stop within the next hundred years, as we are already consuming too many resources for our planetary systems to remain in a stable, life supporting state. We know that we are reaching tipping points, that millions of people are already suffering the consequences of climate change, and billions more will follow in the near future. The problem of peak storage seems almost trivial in comparison.

The solution, proposed by a group of scientists and engineers, is to gain some time by centralizing all our knowledge of the world, and optimizing the shit out of it. This is a reflection of the almost religious way in which we tackle problems nowadays, placing unjustified expectations in unproven technologies, instead of facing the deep and difficult task of changing our behaviors and our society. Of course, something unexpected happens in the story, and Gugu, the sentient, all-knowing AI, is born.

At first, I wanted to create an old-school point & click adventure game based on this premise. The protagonist, an ordinary government worker, finds an error in this vast repository of information, uncovers some buried truths, and ends up confronting Gugu. In 2020, I decided that instead of a game, I could tell the story through a novel. Since then, I've written a hundred pages, and I'm halfway through. I would have continued at a leisurely pace, were it not for the fact that Machine Learning services started popping up everywhere and becoming part of the daily conversation for a lot of people. Suddenly, Gugu was becoming less of a plot device in a novel, and more of a reality from our near future.

One thing that I noticed, when talking about the story, was how much interest people had in the world that I was creating. So instead of taking my sweet time to finish the novel and risking writing yesterday’s news, I decided to organize my ideas for this world, the guguniverse, and write them down as blog posts that could be enjoyed in small bites. I’m publishing them as I write, and will soon include some short stories. Hopefully, at some point in the future, the full novel too.

At the moment of this writing, the guguniverse is still a work in progress. It is an interesting experience, as each new post teaches me something about this world that I didn’t know before. I am enjoying these discoveries very much, and I hope that you enjoy them too.

Sebastián Uribe - Berlin, January 2023

What is Peak Storage?

Emails, photos, e-books, business presentations, movies, music; human beings on the Internet are constantly creating digital content. Even when we are just consuming, our online behavior is registered, leaving a digital trace. And all this information is stored somewhere, for an indefinite time.

By some estimates, humanity is creating on the order of 10^18 bytes per day, or the equivalent of 33 million 4K movies. And that number doubles approximately every two years. This means that it will quadruple in four years, increase eight-fold in six years, sixteen-fold in eight years, and so on and so forth. That's many millions of movies.

Can this number increase forever? All information resides somewhere in the physical world, and we cannot break the laws of physics. Let's imagine that we have a storage device capable of storing one byte per atom. If the current growth trend holds, by the year 2216 we will produce around 10^50 bytes per day. That corresponds approximately to the number of atoms on our planet. We clearly cannot turn the entire planet into a gigantic storage device!

Couldn't we create a device in outer space, using materials from other planets? Even ignoring the practical difficulties, like the fact that an object with that mass would affect our orbit around the sun, we only need to wait an additional two hundred years, until our hunger for information reaches the astronomical number of 10^80 bytes per day. Why is that number relevant? Because it is the number of atoms in the observable universe.

But what about compression? Quantum computers? Holographic storage? Some disruptive tech created by two guys in a Silicon Valley garage? All that technology could potentially do is increase the capacity per atom by a constant factor, while the growth is exponential. Let's see what that means with an example. If we could compress our information 1000 times, then we would need only 10^18/1000, or 10^15 bytes of storage per day. We said we double the number of daily bytes every two years, so after 20 years (doubling 10 times) we would need 10^15 x 2^10 bytes, which is equivalent to 10^15 x 1024, or roughly 10^18 again! As you see, it doesn't matter how much we compress the information; it will only delay reaching this limit by a fixed number of years.
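The back-of-the-envelope numbers above can be checked with a few lines of Python. This is just a sketch: the daily volume, doubling period, and atom counts are the rough assumptions from the text, not measured values.

```python
import math

BYTES_PER_DAY = 1e18        # rough current daily data production (assumption from the text)
DOUBLING_YEARS = 2          # doubling period assumed in the text
ATOMS_ON_EARTH = 1e50       # approximate number of atoms in the Earth
ATOMS_IN_UNIVERSE = 1e80    # approximate number of atoms in the observable universe

def years_until(limit_bytes, compression=1):
    """Years until daily production (after compression) reaches limit_bytes,
    assuming one byte per atom and steady exponential growth."""
    doublings = math.log2(limit_bytes * compression / BYTES_PER_DAY)
    return doublings * DOUBLING_YEARS

print(years_until(ATOMS_ON_EARTH))        # ~213 years: the "year 2216" estimate
print(years_until(ATOMS_IN_UNIVERSE))     # ~412 years: roughly two hundred more
print(years_until(ATOMS_ON_EARTH, 1000))  # 1000x compression buys only ~20 extra years
```

Note how compression enters the formula only inside the logarithm: any constant factor turns into a fixed number of extra doublings, which is why it merely postpones the limit.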

These calculations are not very precise, and some numbers are probably wrong. It might be the amount of daily information, or the rate at which this increases. Maybe we manage to store more than one byte per atom. Maybe we have 500 years before we reach any limit, maybe much less. It doesn't matter, because save for an unforeseen event or catastrophe, sooner or later we will reach it.

There will come a time when no new information can be stored, at least not without deleting something else. Our hard drives will be full. That is peak storage.

The beginning of History

Peak storage is unavoidable, but humanity doesn’t seem interested in curing its addiction to growth. What can be done about it?

In the year 2035, a group of scientists and engineers, concerned about the economic and political consequences of peak storage, founded the Human Data Corporation. Maybe it's not possible to avert peak storage, they thought, but we might be able to minimize its impact on our society. By increasing the efficiency of storage, it might be possible to delay the peak. This could buy us some time to prepare for the moment when we will have to accept the inevitable loss of information. Of course, they focused at first on the technical aspects, seeing the social preparations as something of lesser importance that could be tackled at a later time.

Instead of creating new compression algorithms or storage technology, they analyzed what types of information people were creating. The bulk of it came from videos, photographs, and other recordings of the real world, they noticed, and this type of information has lots of redundancies, or repeated parts. As an example, imagine a group of tourists visiting a city, cameras and phones in hand: their photographs and videos will have lots of similarities, because they are targeting almost the same things, at almost the same time, in almost the same place.

This presented an opportunity for optimization: if we knew how the photographed objects should look, and we knew exactly when, and from where, the pictures were taken, we could reconstruct the photos from that information. It's like asking for "a photograph of Machu Picchu, as seen from exactly this place, at this time of the year, with this light", and obtaining a pixel-perfect image of the World Heritage site, including the llamas.

By the late 2010s and early 2020s, technologies that allowed exactly that started to emerge. They were based on Machine Learning algorithms, fed with images found on the Internet, and could answer requests such as "create a picture of a cat with an astronaut helmet". But they didn’t know whether a particular cat had been in a certain place, at a certain time, nor whether it had an astronaut helmet or not. They just made things up, based on the pictures they were fed. “Show me the cat on my balcony, yesterday at 7 pm” meant nothing to these algorithms; they would just produce a picture of a random cat, on a random balcony, because they lacked a proper model of the world.

The big insight these scientists and engineers had was to build a model of the world using exactly all those photos, videos, and audio recordings that people were making all the time. Instead of just storing them, both their data (colors, sound) and metadata (things like the time and place of creation, or who created them) could be used to build and keep updated a model of the real world. After all, vast regions were being recorded continuously by all sorts of devices, from the microphones in mobile phones, to self-driving cars' LiDAR scanners, to cameras in “smart” doorbells and TV sets. After updating the model with this data, they could then use the metadata to reconstruct the original content. For example, if I take a hundred pictures of my cat on my balcony, asking the model to “show me the cat on my balcony, yesterday at 7 pm” has a much more precise meaning.

They started with the most densely populated areas of the world, where most information was available, and expanded to less populated areas thanks to satellite images and autonomous drones. The initial precision left much to be desired, but it improved fast. And the bigger the model became, the more useful and valuable it was, which fueled its expansion even more. After some decades, it covered more than 99% of the surface of the earth.

Soon it was possible to use this virtual model to find out about anything happening anywhere. It became so complete that nobody found it necessary to use any other source of information. People started calling this virtual model simply History, and anything recorded before, or outside of it, became protohistoric.

Even before History, people had gotten used to storing all their information “in the cloud” and being able to access it at all times. Asking them to erase their childhood memories because they ran out of space, or, even worse, deleting them without their permission, was seen as borderline heretical. But there was no way around it: the world had to prepare for a future with scarce storage.

What if people didn't engage voluntarily in such digital declutter? Who gets to decide, then, what is deleted, and how is that decided? This wasn’t seen as an issue for some time, but after some decades, when it became clear that peak storage was arriving sooner than expected, governments were forced to react. International committees were created to decide on these issues, and several countries imposed a “citizen storage-quota”, regulating how much storage a person was entitled to. Storage poverty became an issue, and storage bytes, the new currency.

The right to be forgotten

History would not be worth much, if it didn't register everything, everywhere. That included, of course, people.

While being seen and heard all the time was for some a dream come true, for others, it was hell. The Human Data Corporation added some privacy protection to History, but it only restricted replaying, that is, looking at it; it did not limit who got registered. Everybody was included.

Parts of the population protested against their every action, word, and movement being recorded for posterity. As History expanded its geographical reach, and more people were captured by it, these protests turned into resistance. Civil rights movements mobilized thousands in capitals worldwide, while activists engaged in acts of civil disobedience, like gluing themselves to servers in data centers, or hacking into them and deleting parts of History.

But those were just minor setbacks. Governments were concerned about the potential impact that an incomplete History would have on the economy, so instead of defending the rights of citizens to privacy, they repressed the protests.

Violent clashes led to the establishment of “no recording zones”, semi-autonomous regions where no cameras or mobile devices were allowed, and anyone seen using them was automatically attacked. Mainstream society started despising these regions and their people, calling them brutes and Luddites, and the term protohistoric acquired derogatory connotations. But the mocking only reinforced their belief that they were on the right path, and the protohistorians amalgamated into a unified front. After some decades of fighting, they managed to establish the so-called “reserves”, remote areas where they would not be registered. They were finally freed from History.

Hundreds of thousands moved into these reserves, from everywhere. They had very diverse backgrounds, and were forced to develop new forms of economic and political organization to survive, as the old ways could not satisfy them anymore. Although they were disconnected from “the outside” and didn’t use History, they did not reject technology entirely. They just wanted the right to be forgotten.

Lagging behind reality

The expansion of History didn't happen without glitches. During its first years, the Human Data Corporation found itself unable to fulfill its promises of worldwide reach. It turned out that their software developers had been too optimistic, underestimating the processing power needed to build History. Immense data centers, designed to process the information from entire continents, were struggling to deal with a handful of cities.

In normal circumstances, this would have been solved by brute force, that is, by adding even more computers to the problem. But a microchip shortage, one additional consequence of union strikes and protests against History, made that option impossible. Investors and government agencies demanded expansion, but the Corporation couldn't deliver it, at least not without adding increasing delays into History.

To understand why, let's imagine a train station where arrivals happen at regular intervals of time. As long as each train leaves before the following one arrives, everything works fine. But if a train takes too long at the station, the following one will be delayed, as it must wait for the previous one to leave. If this happens only once or twice, some trains will miss their scheduled times, but later ones might catch up and avoid further delays. But if every train starts taking longer than expected at the station, then the delays will add up. The first ones, by just a few minutes; the following ones, by an even longer time; and by the end of the day, the trains might be delayed by hours. Eventually, trains will be cancelled, or the schedule adjusted to allow for longer times at the station.

The data centers worked similarly to a train station. Information arrived continuously, in the form of videos, photos or audio, was turned into knowledge of the world by the algorithms, and fed into History. If the processing took too long, delays would start to add up. But unlike the train station, it was not possible to drop "trains" of information, as that would create holes in the knowledge within History. It was also impossible to "adjust the schedule", because information never ceased arriving. The processing had to be done as soon as the information arrived.
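The train-station arithmetic can be sketched as a toy simulation. All the numbers here are made up for illustration; the point is only that when each batch takes longer to process than the interval between arrivals, the lag grows without bound.

```python
def backlog_after(n_batches, arrival_interval, processing_time):
    """Cumulative delay (in minutes) of the n-th batch when a new batch of
    data arrives every arrival_interval minutes but takes processing_time
    minutes to handle. Slack time lets the system catch up; overruns add up."""
    delay = 0.0
    for _ in range(n_batches):
        delay = max(0.0, delay + processing_time - arrival_interval)
    return delay

print(backlog_after(100, 10, 9))    # 0.0   -> processing keeps up, no lag
print(backlog_after(100, 10, 11))   # 100.0 -> every batch adds one more minute of lag
```

With processing just 10% too slow, the hundredth batch already lags by 100 minutes, and nothing short of faster processing (or dropped data) can recover the difference.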

Unless the Corporation managed to speed it up, it had to choose between two options: either be stuck at the first step of their promised worldwide expansion, or write a History that lagged behind reality.

Lost in dogmatic deafness

Letting History lag was out of the question, which meant that the Corporation had to shelve its expansion plans for a while. The financial markets reacted in the way that they usually do: downgrading the Corporation's debt, sinking its market valuation, and dragging down a bunch of other companies with them. Investors were not happy, and neither were governments, who had had enough of the social upheaval brought about by History; the last thing they wanted now was an economic recession. They distanced themselves from the mess, threatening the Corporation with legal action if it did not fulfill its expansion commitments.

Just when the people above were writing off History as history, a ray of hope shone on them from the depths of the research lab. Instead of refining the algorithm by hand, a time-consuming task which nobody enjoyed much, they could train it to optimize itself. After all, optimizing is a form of learning, and that's what those algorithms were built for. The trick was to change what they learned, so that in addition to acquiring facts about the world, they would come up with better ways to process and store those facts. This meta learning promised to increase the efficiency of the system much faster than the manual tinkering of overworked engineers could.

Initial experiments went well, and the new code reached speeds that, hopefully, would be enough for real time processing. But to achieve that, it had to modify the structure of the information in ways beyond the engineers' comprehension. This meant that if something went wrong, recovering the information in History might be close to impossible.

The more conservative higher-ups refused to go in that direction, and pushed for business as usual, recruiting more engineers to - maybe - improve efficiency by marginal amounts. As long as we get investments, they said, we can grow. On the other hand, the self-labeled disruptors wanted to move the new algorithm into production fast, accepting the risks as an unavoidable business reality. Why wait, they argued, if we can fix things as they break? Any other proposal was lost in the dogmatic deafness of these arguments.

The Corporation was torn apart. The disruptors managed to convince some key investors and gained control of the data centers, and therefore of History. But whether that was a victory was debatable, as the conservatives and their allies retained the Corporation's name, most financial assets, and all major business operations. This left the disruptors with few resources to keep History running. Unless their idea worked out, they would find themselves in even more trouble than before.

The language of the creators

Within a week, the victors had rebranded themselves as The History Company and introduced their new algorithm, "the only viable option" for the preservation of History. They assured governments that there was nothing to worry about, managed to bring some new investors on board, and pushed the new code into production.

Unfortunately, the speed of the new algorithm couldn't match that of their public relations. The optimizations were well below expectations, leaving engineers scratching their heads and the public unimpressed. History could now reach almost all of North America and Europe, but adding more regions would overtax the servers. Coverage of Africa, Asia and South America was left on indefinite hold. Only Antarctica, where not much was happening beyond breaking ice shelves and rapidly-melting glaciers, was added to the expansion plans. "Never forget the environment" was the motto, and it included historized penguins and seals. This was no more than a publicity stunt and didn't placate anyone, except for a few oceanographers and climate scientists.

As if putting off the most populous parts of the world wasn't enough to create animosity against them, History's support for non-European languages was abysmal. Speech recognition, the capacity to turn spoken words into written text, was important both for recording the world (or "historizing" it) and for accessing the information (or "replaying" it). Major European languages worked well, but History struggled to understand almost anything else. Replaying any part of History containing badly-supported languages resulted in garbled audio, or the reproduction of entirely wrong words. It was clear that to be properly remembered by History, one had to speak the language of its creators.

Irritated, India and China joined forces to work on their own alternative History, one where their peoples and languages were first-class citizens. Other countries joined their effort, and soon the so-called "Asian History", an answer to the North Atlantic hegemony over human knowledge, went live.

This was not without controversy. At first, its need was questioned purely on technical grounds, and scientific journals were filled with papers claiming the supremacy of one version of History over the other. But that was just a facade, and soon debates moved from computer science conferences to political forums. Western governments felt it necessary to intervene and publicly support The History Company. Technological dominance was a part of it, but their main concern was controlling which version of History would be taught to future generations.

The company was happy to benefit from the resources that came with that assistance, which finally, albeit slowly, allowed it to increase its reach worldwide. In exchange, they worked together with development cooperation agencies to promote digital infrastructure projects in Africa and the rest of the Americas. Asian governments didn't lag far behind, offering assistance in their own ways. Promises of lifting people from their "digital poverty" were soon made and agreements with local governments signed. Cameras were sprinkled over all major cities. Personal communication devices, designed to continuously historize everything in their surroundings, were handed out to almost anyone who asked for them. Natural resources were extracted to build new data centers that promised many new jobs, but employed very few people. The era of centralizing the world's information had started.

Live your History, live

While this East-West schism shook geopolitics and the History Company pursued their quest to historize the world, the Human Data Corporation was trying to figure out its own future. They didn't want to completely depend on their former colleagues, but still believed that History would revolutionize the world, becoming the main storage of human information and the principal way to consume and create media.

That centralization led to the disruption of most existing forms of media. Music, movies, broadcast or streaming; everything would be generated and consumed inside History. Media companies that could not adapt started disappearing, leaving behind huge amounts of content which the Corporation bought and moved into a Protohistoric Archive, the biggest collection of information outside of History. Whether it was digital, magnetic, optical or physical media, it didn’t matter; everything was archived and, most importantly, not historized. Becoming the gatekeepers to old media was the first part of their strategy, as they speculated that protohistoric information would only increase in value with time.

The second part consisted in turning everyone into a content creator. As History took over the world, suddenly everyone became content without realizing it: just walking into a public space made you, and whatever you did, immediately available for the whole world to see. Under the motto "live your History, live", the Corporation offered everyone the possibility to be curators of moments and places. You could be in the middle of conflicts and report news in real time; engage in (premeditated) random acts of kindness with strangers; kill animals, other people or yourself; propose to your loved one in front of the whole world (and be rejected); engage in your favorite BDSM acts; do nothing at all, or just react to people doing all of the above. People could now enhance, tag, post-produce and forever historize any of their actions and, most importantly, monetize them. Of course, with a slice of those earnings going to the Corporation.

(As an interesting side effect, deep fakes, which had become a serious problem by then, stopped being an issue almost overnight, as people questioned anything that they could not check through History. Is a tree really felled, if the fall was not historized?)

Of course, these developments didn't come without issues. Many people were not happy about appearing in public History without having agreed to it, and being erased was simply not an option. Privacy controls were very rudimentary, and neither the History Company, struggling with their finances, nor the Human Data Corporation, profiting from content creation, were willing to invest much to change that.

As anti-History sentiment grew, social organizations demanded government intervention, but Western governments were in a delicate position: if they agreed to the demands for stronger controls, they could not continue criticizing the Asian History, which they had labeled anti-democratic and censorship-ridden. Their hesitance led to increased resistance and violence from the protohistoric movements.

Publicly, the Human Data Corporation and The History Company blamed each other for the situation. In private, they were mending bridges and reaching secret non-compete agreements. After all, they still shared some core investors, and all that really mattered was keeping their shareholders happy.

Delayed law enforcement

While many enjoyed broadcasting every second of their lives, others dreaded being unknowingly historized. And rightly so: cases of pedophiles, stalkers, and thieves using History became increasingly common. The media was obsessed with it, and spared no occasion to attack, mock, or criticize History. From newscasts to documentaries to talk shows, History was everywhere. There was even a reality show that invited couples with a made-up excuse, then showed them historized clips of one cheating on the other. It included a lawyer on set, who had prepared the divorce papers for them to sign live. All of it was historized, of course.

But whether the media were honest, or just fighting a corporate war against The History Company out of fear of obsolescence, the risks were real and on everyone's mind. Not by chance, Merriam-Webster's word of the year was Histalker, a portmanteau of History and stalker.

As fears grew, the idea of looking for crime using History - or, as protohistorians said, government histalking - grew too. The potential for law enforcement wasn't hard to see: it could ease detective work, track down suspects, and even provide proof of their crimes, as long as judges were willing to accept replays as evidence in court.

Meanwhile, the History Company had been working with secret services for some time. The agencies installed computers in History's data centers, black boxes that intercepted all internal network traffic and sent it back to their headquarters. The company was not told what they sent and why, but internal security experts assumed that they read, saw, and heard everything. The government kept this collaboration secret for 'reasons of national security,' with the unfortunate side effect that local law enforcement agencies could not use the information they gathered. From the perspective of the public, the government was doing nothing to protect them.

Some city governments bent under public pressure and ordered their police departments to use History for their detective work, even without a clear legal framework supporting them. Huge rooms full of outsourced workers started monitoring History for signs of crime 24/7, not unlike the operators looking for illegal material in social networks.

This placated mainstream public opinion at first. But small budgets limited the number of agents monitoring History, so many crimes went unseen. And even when they found one, resources were often insufficient to deal with it. That left them in the uncomfortable position of being unable to act on known crimes.

Public dissatisfaction grew again. Soon, groups of civilians organized private "crime searches" using History. In no time, citizen militias appeared in every major city to take crime-fighting into their own hands. Of course, this backfired on the History Company. To control the damage, they added a delay of twenty-four hours to replays. In other words, people had to wait a whole day to see what they had just historized. Only police forces and government agencies could replay History without delays. It helped, but again, just a little.

The Human Data Corporation denounced those delays as a direct attack on their content creation business, especially their recently launched tools for managing live audiences. After much negotiation, they reached an agreement: they would support the development of tools for automatic crime detection in exchange for exempting 'certified' History celebrities' replays from the delay. Of course, certification was an expensive process, and it soon became an important income source for both companies.

While they waited for the development of those tools, police forces were still understaffed, and most historized crimes remained unpunished.

Machine Prediction

Police departments were in a difficult situation. Forced to monitor the streets using History, they suddenly witnessed every single time someone broke the law in public. They could not ignore what they saw because everyone else had the same access to History and could see it too. But the bureaucracy that came with dealing with every single petty theft or minor contravention was just too much.

It took no time for the tech sector to step up and offer a solution. Using Artificial Intelligence (AI), they promised to detect those "minor inconveniences" that drove the police crazy, identify the violators, and start legal processes automatically. Although automating law enforcement was enough to scare the most law-abiding citizens, the Faustian bargain did not stop there: the companies also promised to predict when crimes would happen.

Their solution was to use Machine Learning (ML), a branch of AI based on learning patterns from massive piles of data and using those patterns for recognizing objects, words, or people. Sometimes these systems are taught to identify specific things. For example, a system trained with millions of pictures of dogs can tell if another picture, which it has never seen, contains a dog. Others are generative: when asked for a picture of a dog, they create a new image with one (or at least something resembling a dog: it might have five legs or two tails, but it will still have a certain "dogness" to it.)

Those use cases (image recognition and synthesis) were, together with text composition and speech recognition, some of the most popular and advanced forms of ML until then. How can such a system detect crime?

Those systems work by predicting the correct answer to a question. They compose text by predicting one word at a time. They show photos of puppies by predicting the value of their pixels, one at a time.

As an example: which word should follow in the sentence “The fisher caught a”? Most people will guess “fish”, or the name of a fish species (some cynics might suggest a plastic bag or a shoe). We can guess that because we know something about fishing and what inhabits bodies of water. But a Machine Learning system knows nothing about the world. Instead, after reading millions of sentences, it has learned which words are most likely to appear after others. The word “fish” means nothing to the ML system but, statistically speaking, it is the most likely one to follow “the fisher caught a”. Using this knowledge, a system can write a sentence starting with “The fisher” by predicting that the next words are probably “caught”, “a”, and finally, “fish”. It just creates a sequence of words, one by one.
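The statistical idea can be sketched with a toy bigram model. This is a deliberate simplification (real systems use neural networks rather than raw counts, and the corpus here is invented), but it shows how “the most likely next word” falls out of nothing more than counting:

```python
from collections import Counter, defaultdict

# A four-sentence toy corpus standing in for the millions of
# sentences a real system would read.
corpus = [
    "the fisher caught a fish",
    "the fisher caught a trout",
    "the fisher caught a fish today",
    "the child caught a cold",
]

# Count how often each word follows another (bigram counts).
follows = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1

def predict_next(word):
    """Return the word that most often follows `word` in the corpus."""
    counts = follows.get(word)
    return counts.most_common(1)[0][0] if counts else None

# Compose a sentence one predicted word at a time.
word, sentence = "the", ["the"]
while word is not None and len(sentence) < 5:
    word = predict_next(word)
    if word is not None:
        sentence.append(word)
print(" ".join(sentence))  # the fisher caught a fish
```

The model never learns what a fish is; “fish” wins only because it follows “a” more often than “trout” or “cold” do in this corpus.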

The same technique can predict crime. Instead of training with words, the system trains with actions. Imagine the following: a person parks a car in front of a bank, grabs a gun, gets out of the car, and enters the bank. What is the next action in that sequence? Anyone who has watched enough Hollywood movies can assume they will rob the bank, get back in the car, and escape.

A system for detecting crime would not train by watching films but with replays from History. And like the system that could guess the word "fish", after watching enough replays, this one could predict that the person with the gun will rob the bank. Of course, this required a lot of work. The algorithms needed to be modified to train from History. The training data had to be collected and annotated with additional information (like the word "dog" in the pictures of dogs).

Until then, crime prediction algorithms had found two main uses. First, to predict which parts of a city had a higher likelihood of crime, allowing police departments to use their resources efficiently, patrolling some areas more than others. Second, to score people to determine whether they were likely to commit crimes in the future. Judges used those scores to rule on fines and bail. Both cases suffered from bias: among other issues, police would patrol predominantly poor neighborhoods, and judges would underestimate the risk of white people and overestimate that of people of color.

Tech companies claimed their systems would have no bias because as History contains "all the information on everything happening all the time," everyone would be treated equally. Before there was time to prove them wrong, something else happened: someone stole and leaked the trained models.

The models are where the Machine Learning algorithms store what they learned. They result from thousands of hours of training using powerful computers and millions of examples. It is common for companies to discuss and show their algorithms publicly, but models are rarely released, as the cost of creating them is high.

While other companies could not legally use the stolen models, a community of researchers and enthusiasts soon created others that mimicked their capabilities. These models also allowed fine-tuning. For example, a model that recognizes people's behaviors and actions can be refined for sports, dance, or other human activities. These new, modified models were legal but still needed access to History to make predictions. The History Company saw the business potential and opened up metered access, letting anyone create predictive systems as long as they paid for the data. Every company, from insurance brokers to fashion brands, scrambled to find ways to add predictions to their services. Nobody wanted to be left out.

A problem with these models was that they used substantial amounts of computing power and energy. Big companies could afford to run them, but smaller ones and non-profit organizations could not. A second wave of models soon emerged that traded precision for lower computing power.

These cheaper models opened the gates to, among others, vigilante groups. History's terms of use forbade using replays to track people, so vigilantes trained their systems with illegally downloaded History replays. These focused on what they believed was suspicious behavior, which meant, among other things, marking anyone seeking anonymity from History as potentially dangerous. The conflict between historized and historized-nots was heating up.

Historized and Historized-nots

Less than two years after its introduction, History profoundly changed the lives of billions worldwide. East and West found another reason for disagreement, as each fought for their respective versions of History. In Europe and North America, and less so in other historized parts of the world, commercial exploitation of replays left no business unchanged. Everyone suffered from fear of missing out and looked for ways to gain business advantages using History. The consensus on privacy, personal freedoms, and human rights shifted. As legislators, law enforcers, criminals, conspiracy theorists, vigilantes, and anonymity seekers clashed publicly and privately, everyone else was caught in between. The greatest conflict of all, the division between historized and historized-nots, was slowly becoming unsustainable.

For some, it was about their privacy rights. For others, about how a technology over which they had no control drastically affected their lives. And for all of them, about how society discriminated against — even attacked — anyone who refused to be historized.

The 24-hour delay in public replaying did not stop vigilante and paramilitary groups. Their algorithmic predictions had delays but could still be used to follow and ambush their prey. It sufficed to predict when someone would leave their home or workplace and to wait for them. Not even those living in no-recording zones were safe as long as they left them regularly, as Machine Learning models could tell when and where to find them. The predictions could be imprecise; even a tiny success rate was better than waiting outside all day for a target to walk by. The only protection was to become unpredictable, to provide no pattern for the machine to learn from.

Work changed too, in particular recruiting practices. Using an algorithm to check a curriculum vitae was a thing of the past. HR departments now ran regular History checks. They could quickly inspect whatever they thought was important in an employee. Family values, sexual orientation, community engagement, lifestyle, consumption patterns, or reckless driving were just a few examples of what they could easily find. Of course, this made non-historized people look suspicious. After all, if they had nothing to worry about, why hide?

Of course, this didn’t apply equally to everyone. Those who wanted privacy and could afford it visited the “Faraday clubs,” also called Faraday bars. They were usually built underground and protected by one or more Faraday cages: metal meshes that could block electromagnetic waves from entering or leaving. In other words: no communication with the outside world. They did not allow any recording or electronic devices and were typically members-only. Some appealed to those craving different sexual experiences without fear of being historized. Others appealed to business people, offering meeting rooms, catering, and other services. Most were expensive and frequented only by affluent people — and their friends.

Of course, everyone could build their own Faraday cage in their cellar, and many people did. But as always, the difference between “seeking privacy” and “suspected of hiding something” was measured by your wallet.

Predicting the past

Retro-predictions were one of those ideas that appear seemingly out of nowhere and are so obvious that it is surprising nobody thought of them before.

Traditional – or rather forward – predictions guess what might happen. The algorithm trains with billions of historized events, finding the common patterns between them and how events typically follow each other. It then uses those patterns to figure out what could happen after a particular situation.

Retro-predictions work similarly, but the algorithms train with sequences of events in reverse order. Instead of learning that someone leaves their home, then walks to the train station, then takes a train, the algorithm learns that someone takes a train after they walk to the station after they leave their home. Because the algorithm ignores the meaning of those actions, the order in which they occur is almost irrelevant; whether it knows that X happened before Y or that Y happened after X, the end result is the same. But training with events in forward order leads to forward-predictions, while learning in backward order leads to retro-predictions.
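The symmetry can be sketched with a toy counting model. This is a deliberate simplification of real ML (the event names are invented, and counts stand in for a trained model), but it shows how the same algorithm yields forward- or retro-predictions depending only on the order of its training data:

```python
from collections import Counter, defaultdict

# Hypothetical event sequences standing in for History replays.
replays = [
    ["leave_home", "walk_to_station", "take_train"],
    ["leave_home", "walk_to_station", "take_train"],
    ["leave_home", "drive_to_work"],
]

def train(sequences):
    """Count which event most often follows another in the sequences."""
    follows = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            follows[prev][nxt] += 1
    return follows

def predict(model, event):
    """Return the most likely event to come after `event` under `model`."""
    counts = model.get(event)
    return counts.most_common(1)[0][0] if counts else None

# Forward model: trained on events in their original order.
forward = train(replays)
# Retro model: the identical algorithm, trained on reversed sequences.
retro = train([list(reversed(seq)) for seq in replays])

print(predict(forward, "walk_to_station"))  # take_train: what happens next
print(predict(retro, "walk_to_station"))    # leave_home: what happened before
```

Nothing in `train` changes between the two models; only the direction of the sequences it reads does.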

At first, the algorithm's predictions were not very precise, nor could it be used everywhere. History had not been created with retro-predictions in mind, so its capacity to fill in holes in its information by looking backward was limited: it could barely see a few seconds before any event, and the predictions were not especially good.

But even with those limitations, retro-predictions had a profound effect on people. They challenged the idea of time as an arrow flowing in one direction, where things in the future always depended on those from the past. If we could see events in the future as defining those in the past, what was the real difference between them both?

While philosophers spent their days discussing the essence of time, governments invested heavily in the new algorithms. In the West, they did so by increasing their support for the History Company and the Human Data Corporation, which, at this point, had stopped fighting each other in favor of filling their coffers with public money. In the East, governments worked hard to ensure that predictions of the past could only show officially sanctioned historical facts.

Among those interested in retro-predictions were law enforcement agencies, salivating at the idea of more algorithmic case-solving. They imagined a future where they could see the past, no matter how well someone buried it, and they set up labs to experiment with the new technology. Soon they came up with different ways to improve the predictions, for example, by combining them with their software for identifying individuals. This mash-up of predictive technologies increased their precision to uncanny levels.

While some welcomed the possibility of solving crimes without needing reliable witnesses or direct recordings, others grew restless. A group of civil rights organizations formed the Freedom to Forget coalition to prevent such a future from materializing. But the fear of an algorithmic police state drove thousands from the cities into low-population, low-historized areas.

As anthropogenic climate change made vast regions of Siberia habitable, groups of pioneers founded some of the first historized-free territories there. The settlements didn't last long, as the melting permafrost released huge amounts of anthrax into the air, and most settlers died after only a couple of years. Others managed to build permanent settlements in northern Europe, especially in the fjords of Norway. Patagonia was another destination of choice, where it was possible to find locations hundreds of kilometers apart from existing settlements.

The privacy of these refugees, the first real proto-historics, was guaranteed for a while.

Breaking free

Middle-class people flee from no-recording zones, as they turn dangerous for life and career. First recording-free zone established in the middle of Ukraine by a group of anarchists. “I’d rather glow in the dark than show up on your computer screens.”

Worldwide expansion

Both Histories continue to expand worldwide. The unavoidable advance of progress.

Clean energy for historizing displaces uses that support communities. Where will all the additional electricity for all those recording devices and data centers come from?

Something… changes

How does Gugu come to be?

The North Korean hacker.

Tying up loose ends

The language of the protohistorics