From 2abdaf04d0a6576172ce865979273476dbe47c2c Mon Sep 17 00:00:00 2001 From: Aki Date: Tue, 25 Jan 2022 23:43:53 +0100 Subject: Written an article with a very long name --- ...es_how_to_store_files_in_arbitrary_memory-1.png | Bin 0 -> 1957 bytes ...es_how_to_store_files_in_arbitrary_memory-2.png | Bin 0 -> 2919 bytes ...ces_how_to_store_files_in_arbitrary_memory.html | 129 +++++++++++++++++++++ index.html | 3 + 4 files changed, 132 insertions(+) create mode 100644 design_simple_interfaces_how_to_store_files_in_arbitrary_memory-1.png create mode 100644 design_simple_interfaces_how_to_store_files_in_arbitrary_memory-2.png create mode 100644 design_simple_interfaces_how_to_store_files_in_arbitrary_memory.html diff --git a/design_simple_interfaces_how_to_store_files_in_arbitrary_memory-1.png b/design_simple_interfaces_how_to_store_files_in_arbitrary_memory-1.png new file mode 100644 index 0000000..4f7f4b0 Binary files /dev/null and b/design_simple_interfaces_how_to_store_files_in_arbitrary_memory-1.png differ diff --git a/design_simple_interfaces_how_to_store_files_in_arbitrary_memory-2.png b/design_simple_interfaces_how_to_store_files_in_arbitrary_memory-2.png new file mode 100644 index 0000000..51608f5 Binary files /dev/null and b/design_simple_interfaces_how_to_store_files_in_arbitrary_memory-2.png differ diff --git a/design_simple_interfaces_how_to_store_files_in_arbitrary_memory.html b/design_simple_interfaces_how_to_store_files_in_arbitrary_memory.html new file mode 100644 index 0000000..1651cde --- /dev/null +++ b/design_simple_interfaces_how_to_store_files_in_arbitrary_memory.html @@ -0,0 +1,129 @@ + + + + + + + + + +Design Simple Interfaces: How to Store Files in Arbitrary Memory + + + +
+

Design Simple Interfaces:
How to Store Files in Arbitrary Memory

+

Published on 2022-01-25 23:45:00+01:00 +

This one may come as a bit of a surprise, and to an experienced (or even intermediate) reader it may appear boring. Think of +it as a "back to the fundamentals" type of article. Now that I think about it... Do I even have anything that wouldn't +fall into this category? Oh, yeah, rants.

+bird +

I had an opportunity to give some lectures on programming over at the local university, and one of the topics was +loosely related to persistence. We were playing around with a hardware +simulator and I asked a question that we could use as a plot device: +

What is the simplest way to put several files into an arbitrary memory? +

Now, this probably wouldn't be a blog post if it weren't for the fact that not only the students were surprised by the answer, but also my +colleagues at work. +

How to Store Something in Memory?

+

We were using a simulator that I prepared for the lectures. It has a rather modern and straightforward interface for +writing data into the memory: +

+bool hwd::memory::write(std::vector<char> data, std::size_t offset);
+
+

In practice you will more likely use something like: +

+ssize_t write(int fd, const void * buf, size_t count);
+ssize_t pwrite(int fd, const void * buf, size_t count, off_t offset);
+
+

A regular write can be given an offset by first calling lseek on the file descriptor. +
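To make the difference between the two calls concrete, here is a minimal sketch, assuming a scratch file from mkstemp stands in for whatever the file descriptor would really point at. The function name write_both_ways and the /tmp path are made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <stdlib.h>
#include <unistd.h>

// Writes text at offset twice, once per variant, and returns what pread
// finds at that offset afterwards.
std::string write_both_ways(const std::string & text, off_t offset) {
	char path[] = "/tmp/memory-sketch-XXXXXX"; // hypothetical scratch file
	const int fd = mkstemp(path);
	assert(fd != -1);
	unlink(path); // the file goes away once fd is closed

	// Variant 1: move the file offset first, then do a regular write.
	const off_t position = lseek(fd, offset, SEEK_SET);
	assert(position == offset);
	ssize_t count = write(fd, text.data(), text.size());
	assert(count == static_cast<ssize_t>(text.size()));

	// Variant 2: pwrite takes the offset as an argument.
	count = pwrite(fd, text.data(), text.size(), offset);
	assert(count == static_cast<ssize_t>(text.size()));

	// Read the bytes back from the same offset.
	std::string result(text.size(), '\0');
	count = pread(fd, &result[0], result.size(), offset);
	assert(count == static_cast<ssize_t>(text.size()));

	close(fd);
	return result;
}
```

Both variants put the same bytes in the same place. The practical difference is that pwrite leaves the file offset untouched, while the lseek and write pair moves it.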

Whatever the case is for you, it's almost a given that there will be at least some similarities. I'll +continue to use my simulator as an example: it's simple, you can build it yourself, and it should be easy to translate. +

Let's use it! We need some data and then we just push it to the memory: +

+std::vector<char> data {'H', 'e', 'l', 'p', 0};
+bool ok = hwd::memory::write(data, 0);
+
+

Nice, we wrote our call for help to the memory with a terminating null byte. How does one receive it? +

How to Read Something From Memory?

+

There are similar functions provided: +

+std::vector<char> hwd::memory::read(std::size_t length, std::size_t offset);
+ssize_t read(int fd, void * buf, size_t count);
+ssize_t pread(int fd, void * buf, size_t count, off_t offset);
+
+

For any questions, refer to your friendly manual page. If you are not familiar with this convention: in +read and pread the content read from the memory is written to the buf that was +passed as an argument, and the function result is just the count of bytes that were read. +

With this, we can: +

+auto data = hwd::memory::read(5, 0);
+
+
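If you don't have the simulator at hand, a stand-in with the same shape is easy to put together. The sketch below is not the simulator's real implementation: the fake_memory namespace, the 10000-byte capacity, and the choice to treat out-of-range reads as a programming error are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the simulator's memory.
namespace fake_memory {

constexpr std::size_t capacity = 10000; // assumed size of the memory

// The backing storage, zero-filled on first use.
inline std::vector<char> & cells() {
	static std::vector<char> storage(capacity, '\0');
	return storage;
}

// Copies data into the memory at offset; refuses writes that do not fit.
inline bool write(const std::vector<char> & data, std::size_t offset) {
	if (offset > capacity || data.size() > capacity - offset)
		return false;
	std::copy(data.begin(), data.end(),
		cells().begin() + static_cast<std::ptrdiff_t>(offset));
	return true;
}

// Returns length bytes starting at offset; the range must lie inside the memory.
inline std::vector<char> read(std::size_t length, std::size_t offset) {
	assert(offset <= capacity && length <= capacity - offset);
	const auto first = cells().begin() + static_cast<std::ptrdiff_t>(offset);
	return std::vector<char>(first, first + static_cast<std::ptrdiff_t>(length));
}

} // namespace fake_memory
```

With this in place, writing the five bytes at offset 0 and reading five bytes back from offset 0 gives the same vector.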

But that's only because we know the length of the data beforehand. And the truth is that a program needs to know +something before it starts reading and then using the data. That something is usually a standard: a memory layout or a +format. In this case, we could assume that the program needs to store only one null-terminated string. This kind of +knowledge would be enough for the program to write and read arbitrary strings as long as they are null-terminated. +
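For the null-terminated convention, a reader could look roughly like the sketch below. It assumes a read function shaped like hwd::memory::read; the reader alias, read_c_string, and the limit parameter are hypothetical names introduced here, and reading one byte at a time is chosen for clarity rather than speed.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Assumed shape of the read function: (length, offset) -> bytes.
using reader = std::function<std::vector<char>(std::size_t, std::size_t)>;

// Collects bytes from offset 0 until the terminating null byte,
// giving up after limit bytes so a missing terminator cannot loop forever.
std::string read_c_string(const reader & read, std::size_t limit) {
	assert(read && "a read function is required");
	std::string text;
	for (std::size_t offset = 0; offset < limit; ++offset) {
		const std::vector<char> chunk = read(1, offset);
		if (chunk.empty() || chunk[0] == '\0')
			break;
		text.push_back(chunk[0]);
	}
	return text;
}
```

The only thing the reading side has to know in advance is the convention itself: start at offset 0 and stop at the first null byte.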

Since we are using strings as an example, there is also another way to implement them: a length coupled with an array. +This kind of format allows the string to contain null bytes, which is convenient when storing slightly more arbitrary +data.

+i/o +

Assume that you know the size of the memory (in the case of this simulator it's 10KB). Figure out the number of bytes +you need to store the maximum length of the data (2 bytes here), and then just say that there is an integer in +the first X bytes of that memory. When writing, put the length of the data in these bytes. When reading, get the length from them +and then read that many of the bytes that follow. +

More or less it looks something like this: +

+if (data.size() <= 9998) {
+	prepend_uint16_to_vector(data, static_cast<std::uint16_t>(data.size()));
+	hwd::memory::write(data, 0);
+}
+
+

Where prepend_uint16_to_vector does exactly what you would expect of it, in a well-defined way (e.g., the byte order +is the same on every platform). Then, to read, reverse the process: +

+auto data = hwd::memory::read(2, 0);
+auto length = read_uint16_from_vector(data);
+data = hwd::memory::read(length, 2);
+
+

In this case read_uint16_from_vector also does what you would expect of it. Note that this sample does +two reads in total: the first to get the length, the second to get the data of the given length. +
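The two helpers are not shown in this post, so here is one way they might look. This is a sketch under an assumption: the length is stored most significant byte first. Any fixed order works as long as the writer and the reader agree on it.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Puts value in front of data as two bytes, most significant byte first
// (the byte order is an assumption of this sketch).
void prepend_uint16_to_vector(std::vector<char> & data, std::uint16_t value) {
	const char high = static_cast<char>(static_cast<unsigned char>(value >> 8));
	const char low = static_cast<char>(static_cast<unsigned char>(value & 0xFF));
	data.insert(data.begin(), {high, low});
}

// Rebuilds the integer from the first two bytes of data.
std::uint16_t read_uint16_from_vector(const std::vector<char> & data) {
	assert(data.size() >= 2);
	// Go through unsigned char so that bytes above 127 are not sign-extended.
	const auto high = static_cast<unsigned char>(data[0]);
	const auto low = static_cast<unsigned char>(data[1]);
	return static_cast<std::uint16_t>((high << 8) | low);
}
```

Shifting and masking, instead of copying the integer's raw bytes, is what keeps the stored format independent of the endianness of the machine that runs the code.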

How to Do the Same With Multiple Files

+

Now you might start wondering how to apply this knowledge to store multiple independent files. The thing is, you +are already there. Well, almost. First, let the data be initialized from standard input: +

+std::vector<char> data {std::istreambuf_iterator<char>{std::cin}, {}};
+
+

Of course, there are better ways to do it, but for the sake of the length of this post, let's not discuss them. In +case this looks like Elvish to you, trust me: this reads standard input into data. After you have this, +the process remains the same: get the length, put it in the first two bytes of the memory, and then put the data itself. +

You have the input stream saved in memory. Now read it using our method and then output it to the standard output +stream: +

+for (const auto & byte : data)
+	std::cout << byte;
+
+

Or using any other method as long as it doesn't create any clutter. +

By now you either got it or you started wondering where this is going. We are not storing any files in the memory! +I'm lying! That's a fair accusation. We are not doing it, but we created an interface that allows us to do it. Think about +it: +

+$ tar cf - directory/ | ./our_memory_write
+$ ./our_memory_read | tar tf -
+directory/file_a
+directory/file_b
+
+
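The two programs in that pipeline are not listed in this post, so the following is only a sketch of what their cores might look like. It assumes a plain std::vector<char> stands in for the simulated memory and that the length is stored most significant byte first; store, load, round_trip, and memory_image are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <istream>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

using memory_image = std::vector<char>; // stand-in for the simulated memory

// Core of the writer: slurp the stream, store it as [2-byte length][data].
bool store(std::istream & in, memory_image & memory) {
	const std::vector<char> data {std::istreambuf_iterator<char>{in}, {}};
	if (memory.size() < 2 || data.size() > memory.size() - 2 || data.size() > 0xFFFF)
		return false;
	memory[0] = static_cast<char>(static_cast<unsigned char>(data.size() >> 8));
	memory[1] = static_cast<char>(static_cast<unsigned char>(data.size() & 0xFF));
	std::copy(data.begin(), data.end(), memory.begin() + 2);
	return true;
}

// Core of the reader: get the length, then emit exactly that many raw bytes.
void load(const memory_image & memory, std::ostream & out) {
	assert(memory.size() >= 2);
	const std::size_t length =
		(static_cast<unsigned char>(memory[0]) << 8) | static_cast<unsigned char>(memory[1]);
	assert(length <= memory.size() - 2);
	out.write(memory.data() + 2, static_cast<std::streamsize>(length));
}

// In-process version of the pipeline; returns an empty string when input does not fit.
std::string round_trip(const std::string & input) {
	std::istringstream in(input);
	memory_image memory(10000, '\0'); // assumed 10000-byte memory
	if (!store(in, memory))
		return {};
	std::ostringstream out;
	load(memory, out);
	return out.str();
}
```

Neither function knows or cares that the bytes happen to be a tar archive. They only move one length-prefixed blob, and that is exactly why the archive survives the trip.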

Ha! Assuming you have enough space, you can even create an image, format it, and then write it to memory: +

+$ fallocate -l 9998 filesystem.img
+$ mkfs.bfs filesystem.img
+$ ./our_memory_write < filesystem.img
+
+

Quite satisfying, but that's not the point of this post. +

Design Simple Interfaces

+

This isn't about UNIX philosophy. This isn't about standardization of the entire world. No. It's all about you, your +creations, and things that you integrate with. And it sounds all hippie-dippie-unicorns, but that's just how it is. +Think about what you are building. Think about what you will need and what you will integrate with. Find common generic +interfaces that will last and build upon them. This method of saving files into memory works because it builds on a +well-established and incredibly simple convention. Look at the all-popular Web APIs: they are built on top of HTTP so well +that it's hard to believe that there are any other protocols out there on the net. +

+ diff --git a/index.html b/index.html index 54107ca..dbe28ef 100644 --- a/index.html +++ b/index.html @@ -32,6 +32,9 @@

Blog