-rw-r--r-- design_simple_interfaces_how_to_store_files_in_arbitrary_memory-1.png bin 0 -> 1957 bytes
-rw-r--r-- design_simple_interfaces_how_to_store_files_in_arbitrary_memory-2.png bin 0 -> 2919 bytes
-rw-r--r-- design_simple_interfaces_how_to_store_files_in_arbitrary_memory.html 129
-rw-r--r-- index.html 3
4 files changed, 132 insertions, 0 deletions
diff --git a/design_simple_interfaces_how_to_store_files_in_arbitrary_memory-1.png b/design_simple_interfaces_how_to_store_files_in_arbitrary_memory-1.png
new file mode 100644
index 0000000..4f7f4b0
--- /dev/null
+++ b/design_simple_interfaces_how_to_store_files_in_arbitrary_memory-1.png
Binary files differ
diff --git a/design_simple_interfaces_how_to_store_files_in_arbitrary_memory-2.png b/design_simple_interfaces_how_to_store_files_in_arbitrary_memory-2.png
new file mode 100644
index 0000000..51608f5
--- /dev/null
+++ b/design_simple_interfaces_how_to_store_files_in_arbitrary_memory-2.png
Binary files differ
diff --git a/design_simple_interfaces_how_to_store_files_in_arbitrary_memory.html b/design_simple_interfaces_how_to_store_files_in_arbitrary_memory.html
new file mode 100644
index 0000000..1651cde
--- /dev/null
+++ b/design_simple_interfaces_how_to_store_files_in_arbitrary_memory.html
@@ -0,0 +1,129 @@
+<!doctype html>
+<html lang="en">
+<meta charset="utf-8">
+<meta name="viewport" content="width=device-width, initial-scale=1">
+<meta name="author" content="aki">
+<meta name="tags" content="programming, software design, interface design, UNIX philosophy">
+<link rel="icon" type="image/png" href="cylo.png">
+<link rel="stylesheet" href="style.css">
+
+<title>Design Simple Interfaces: How to Store Files in Arbitrary Memory</title>
+
+<nav><p><a href="https://ignore.pl">ignore.pl</a></p></nav>
+
+<article>
+<h1>Design Simple Interfaces:<br>How to Store Files in Arbitrary Memory</h1>
+<p class="subtitle">Published on 2022-01-25 23:45:00+01:00
+<p>This one may come as a little surprise, and to an experienced (or even intermediate) reader it may appear boring.
+Think of it as a "back to the fundamentals" type of article. Now that I think about it... Do I even have anything that
+wouldn't fall into this category? Oh, yeah, rants.</p>
+<img src="design_simple_interfaces_how_to_store_files_in_arbitrary_memory-1.png" alt="bird">
+<p>I had an opportunity to give some lectures on programming at the local university, and one of the topics was
+loosely related to persistence. We were playing around with a <a href="https://git.ignore.pl/hwd/hwd/">hardware
+simulator</a> and I asked a question that we could use as a plot device:
+<p><strong>What is the simplest way to put several files into an arbitrary memory?</strong>
+<p>Now, this probably wouldn't be a blog post if the answer hadn't surprised not only the students but also my
+colleagues at work.
+<h2>How to Store Something in Memory?</h2>
+<p>We were using a simulator that I prepared for the lectures. It has a rather modern and straightforward interface for
+writing data into the memory:
+<pre>
+bool hwd::memory::write(std::vector&lt;char&gt; data, std::size_t offset);
+</pre>
+<p>Usually you will probably use something more similar to:
+<pre>
+ssize_t write(int fd, const void * buf, size_t count);
+ssize_t pwrite(int fd, const void * buf, size_t count, off_t offset);
+</pre>
+<p>Where regular <code>write</code> can be offset by using <code>lseek</code> on the file descriptor.
+<p>Whatever the case is for you, it's almost a given that there will be at least some similarities. I'll
+continue to use my simulator as an example: it's simple, you can build it yourself, and it should be easily translatable.
+<p>Let's use it! We need some data and then we just push it to the memory:
+<pre>
+std::vector&lt;char&gt; data {'H', 'e', 'l', 'p', 0};
+bool ok = hwd::memory::write(data, 0);
+</pre>
+<p>Nice, we wrote our call for help to the memory with a terminating null byte. How does one receive it?
+<h2>How to Read Something From Memory?</h2>
+<p>There are similar functions provided:
+<pre>
+std::vector&lt;char&gt; hwd::memory::read(std::size_t length, std::size_t offset);
+ssize_t read(int fd, void * buf, size_t count);
+ssize_t pread(int fd, void * buf, size_t count, off_t offset);
+</pre>
+<p>For any questions, refer to your friendly manual page. If you are not familiar with this convention: in
+<code>read</code> and <code>pread</code> the content read from the memory is written to the <code>buf</code> that was
+passed as an argument, and the function result is just the count of bytes that were read.
+<p>With this, we can:
+<pre>
+auto data = hwd::memory::read(5, 0);
+</pre>
+<p>But that's only because we know the length of the data beforehand. And the truth is that a program needs to know
+something before it starts reading and then using the data. That something is usually a standard: a memory layout or a
+format. In this case, we could assume that the program needs to store only one null-terminated string. This kind of
+knowledge would be enough for the program to write and read arbitrary strings as long as they are null-terminated.
+<p>Since we are using strings as an example, there is also another implementation of them: length coupled with an array.
+This kind of format allows the string to contain null bytes, which is convenient when storing slightly more arbitrary
+data.</p>
+<img src="design_simple_interfaces_how_to_store_files_in_arbitrary_memory-2.png" alt="i/o">
+<p>Assume that you know the size of the memory (in the case of this simulator it's 10KB). Figure out the number of bytes
+you need to store the maximum length of the data (2 bytes here), and then just say that there is an integer in the
+first X bytes of that memory. When writing, put the length of the data into these bytes. When reading, get the length
+from them and then read that many bytes from the memory that follows.
+<p>More or less it looks something like this:
+<pre>
+if (data.size() &lt;= 9998) {
+ prepend_uint16_to_vector(data, static_cast&lt;std::uint16_t&gt;(data.size()));
+ hwd::memory::write(data, 0);
+}
+</pre>
+<p>Where <code>prepend_uint16_to_vector</code> does exactly what you can expect from it in a known way (e.g., endianness
+is consistent between platforms). Then, to read, reverse the process:
+<pre>
+auto data = hwd::memory::read(2, 0);
+auto length = read_uint16_from_vector(data);
+data = hwd::memory::read(length, 2);
+</pre>
+<p>In this case <code>read_uint16_from_vector</code> also does what you can expect from it. Note that this sample does
+two reads in total: the first to get the length, the second to get the data of the given length.
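+<p>The post leaves both helpers undefined, so here is a minimal sketch of what they might look like. The names come
+from the samples above, but the bodies are an assumption: they pick a fixed big-endian byte order so that the result
+does not depend on the platform.

```cpp
#include <cstdint>
#include <vector>

// Assumed implementation: put `value` in front of `data` as two bytes,
// most significant byte first (big-endian), whatever the host endianness is.
void prepend_uint16_to_vector(std::vector<char> & data, std::uint16_t value)
{
    const char bytes[2] = {
        static_cast<char>(static_cast<unsigned char>(value >> 8)),
        static_cast<char>(static_cast<unsigned char>(value & 0xff)),
    };
    data.insert(data.begin(), bytes, bytes + 2);
}

// Assumed implementation: decode the two leading bytes written above.
// Going through unsigned char avoids sign extension where char is signed.
// The caller must make sure that data holds at least two bytes.
std::uint16_t read_uint16_from_vector(const std::vector<char> & data)
{
    const auto high = static_cast<unsigned char>(data[0]);
    const auto low = static_cast<unsigned char>(data[1]);
    return static_cast<std::uint16_t>((high << 8) | low);
}
```

+<p>Any byte order works as long as the writer and the reader agree on it.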
+<h2>How to Do the Same With Multiple Files</h2>
+<p>Now you might start thinking about how to apply this knowledge to store multiple independent files. The thing is, you
+are already there. Well, almost. First, initialize the data from standard input:
+<pre>
+std::vector&lt;char&gt; data {std::istreambuf_iterator&lt;char&gt;{std::cin}, {}};
+</pre>
+<p>Of course, there are better ways to do it, but for the sake of the length of this post, let's not discuss them. In
+case this looks like Elvish to you: trust me, this reads standard input to <code>data</code>. After you have this,
+the process remains the same: get the length, put it in the first two bytes of the memory, and then put the data itself.
+<p>You have the input stream saved in memory. Now read it using our method and then output it to the standard output
+stream:
+<pre>
+for (const auto &amp; byte : data)
+ std::cout &lt;&lt; byte;
+</pre>
+<p>Or using any other method as long as it doesn't create any clutter.
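+<p>Put together, the two halves might look roughly like the sketch below. Everything in it is an assumption made for
+illustration: <code>fake_memory</code> is a stand-in with the same shape as the simulator interface, 10KB is taken to
+mean 10000 bytes, and <code>store</code> and <code>load</code> are made-up names.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for hwd::memory, backed by a plain array.
namespace fake_memory
{
std::array<char, 10000> bytes {};  // assumed: 10KB means 10000 bytes

bool write(std::vector<char> data, std::size_t offset)
{
    if (offset > bytes.size() || data.size() > bytes.size() - offset)
        return false;
    std::copy(data.begin(), data.end(), bytes.data() + offset);
    return true;
}

std::vector<char> read(std::size_t length, std::size_t offset)
{
    if (offset > bytes.size() || length > bytes.size() - offset)
        return {};
    const char * first = bytes.data() + offset;
    return std::vector<char>(first, first + length);
}
}

// Write side: two-byte big-endian length followed by the payload.
bool store(const std::vector<char> & payload)
{
    if (payload.size() > fake_memory::bytes.size() - 2)
        return false;
    std::vector<char> framed;
    framed.reserve(payload.size() + 2);
    framed.push_back(static_cast<char>(static_cast<unsigned char>(payload.size() >> 8)));
    framed.push_back(static_cast<char>(static_cast<unsigned char>(payload.size() & 0xff)));
    framed.insert(framed.end(), payload.begin(), payload.end());
    return fake_memory::write(framed, 0);
}

// Read side: first read gets the length, second read gets the payload.
std::vector<char> load()
{
    const auto header = fake_memory::read(2, 0);
    const std::size_t length =
        (static_cast<std::size_t>(static_cast<unsigned char>(header[0])) << 8)
        | static_cast<unsigned char>(header[1]);
    return fake_memory::read(length, 2);
}
```

+<p>A writing program would then pass the bytes read from standard input to <code>store</code>, and a reading program
+would send the result of <code>load</code> to standard output.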
+<p>By now you either got it or you started wondering where this is going. We are not storing any files in the memory!
+I'm lying! That's a fair accusation. We are not doing it, but we created an interface that allows us to do it. Think
+about it:
+<pre>
+$ tar cf - directory/ | ./our_memory_write
+$ ./our_memory_read | tar tf -
+directory/file_a
+directory/file_b
+</pre>
+<p>Ha! Assuming you have enough space, you can even create an image, format it, and then write it to memory:
+<pre>
+$ fallocate -l 9998 filesystem.img
+$ mkfs.bfs filesystem.img
+$ ./our_memory_write &lt; filesystem.img
+</pre>
+<p>Quite satisfying, but that's not the point of this post.
+<h2>Design Simple Interfaces</h2>
+<p>This isn't about UNIX philosophy. This isn't about standardization of the entire world. No. It's all about you, your
+creations, and things that you integrate with. And it sounds all hippie-dippie-unicorns, but that's just how it is.
+Think about what you are building. Think about what you will need and what you will integrate with. Find common generic
+interfaces that will last and build upon them. This method of saving files into memory works because it builds on a
+well-established and incredibly simple convention. Look at the all-popular Web APIs: they are built on top of HTTP so
+well that it's hard to believe that there are any other protocols out there on the net.
+</article>
+<script src="https://stats.ignore.pl/track.js"></script>
diff --git a/index.html b/index.html
index 54107ca..dbe28ef 100644
--- a/index.html
+++ b/index.html
@@ -32,6 +32,9 @@
<section id="blog">
<h2>Blog</h2>
<ul>
+<li> <a href="design_simple_interfaces_how_to_store_files_in_arbitrary_memory.html">Design Simple Interfaces: How to
+ Store Files in Arbitrary Memory</a><br>
+ <time>2022-01-25</time>
<li> Once again I'm breaking this place apart. For now I separated main index from the blog and reverted back to
date-based indexing for the blog. I don't quite like the looks of it just yet, so consider this an ongoing
effort.<br>