<!doctype html>
<html lang="en">
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="author" content="aki">
<meta name="tags" content="programming, software design, interface design, UNIX philosophy">
<meta name="published-on" content="2022-01-25T23:45:00+01:00">
<meta name="last-modified-on" content="2022-01-26T18:44:00+01:00">
<link rel="icon" type="image/png" href="favicon.png">
<link rel="stylesheet" href="style.css">

<title>Playing Around With Simple Interfaces</title>

<header>
<nav><a href="https://ignore.pl">ignore.pl</a></nav>
<time>25 January 2022</time>
<h1>Playing Around With Simple Interfaces</h1>
</header>

<article>
<p>This one may come as a little surprising, and to an experienced (or even intermediate) reader it may appear boring.
Think of it as a "back to the fundamentals" type of article. Now that I think about it... Do I even have anything that
wouldn't fall into this category? Oh, yeah, rants.</p>
<img src="playing_around_with_simple_interfaces-1.png" alt="bird">
<p>I had an opportunity to give lectures at the local university about programming and one of the topics was loosely
related to persistence. We were playing around with a <a href="https://git.ignore.pl/hwd/hwd/">hardware simulator</a>
and I asked a question that we could use as a plot device:
<p><strong>What is the simplest way to put several files into an arbitrary memory?</strong>
<p>Now, this probably wouldn't be a blog post if the answer had surprised only the students, but it surprised my
colleagues at work as well.
<h2>How to Store Something in Memory?</h2>
<p>We were using a simulator that I prepared for the lectures. It has a rather modern and straightforward interface for
writing data into the memory:
<pre>
bool hwd::memory::write(std::vector&lt;char&gt; data, std::size_t offset);
</pre>
<p>Usually, you will use something closer to:
<pre>
ssize_t write(int fd, const void * buf, size_t count);
ssize_t pwrite(int fd, const void * buf, size_t count, off_t offset);
</pre>
<p>Where regular <code>write</code> can be offset by using <code>lseek</code> on the file descriptor.
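<p>To make the difference between the two a bit more tangible, here is a minimal sketch. The names
<code>seek_then_write</code> and <code>write_help_twice</code> are made up for this example, and the scratch file
stands in for whatever descriptor you would really write to:

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Position the descriptor first, then write. Unlike pwrite, this changes
// the file offset that later read and write calls on fd will start from.
ssize_t seek_then_write(int fd, const void * buf, std::size_t count, off_t offset)
{
	assert(fd >= 0);
	if (lseek(fd, offset, SEEK_SET) == static_cast<off_t>(-1))
		return -1;
	return write(fd, buf, count);
}

// Store the same five bytes twice in a scratch file: once through lseek and
// write, once through pwrite. Returns true when both calls wrote everything.
bool write_help_twice(const char * path)
{
	const char data[] = {'H', 'e', 'l', 'p', 0};
	const int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return false;
	const bool ok = seek_then_write(fd, data, sizeof data, 0) == 5
		&& pwrite(fd, data, sizeof data, 5) == 5;
	close(fd);
	return ok;
}
```

<p>Both calls end up putting bytes at a chosen position. The practical difference is that <code>pwrite</code> leaves
the descriptor's own offset alone.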
<p>Whichever interface you end up with, it will almost certainly resemble these. I'll continue to use my simulator as
the example: it's simple, you can build it yourself, and it should translate easily.
<p>Let's use it! We need some data and then we just push it to the memory:
<pre>
std::vector&lt;char&gt; data {'H', 'e', 'l', 'p', 0};
bool ok = hwd::memory::write(data, 0);
</pre>
<p>Nice, we wrote our call for help to the memory with a terminating null byte. How does one receive it?
<h2>How to Read Something From Memory?</h2>
<p>There are similar functions provided:
<pre>
std::vector&lt;char&gt; hwd::memory::read(std::size_t length, std::size_t offset);
ssize_t read(int fd, void * buf, size_t count);
ssize_t pread(int fd, void * buf, size_t count, off_t offset);
</pre>
<p>For any questions refer to your friendly manual page. If you are not familiar with this convention: in
<code>read</code> and <code>pread</code> the content read from the memory is written to the <code>buf</code> that was
passed as argument and the function result is just the count of bytes that were read.
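<p>If the buffer convention feels awkward, it can be wrapped into the shape that the simulator uses. The sketch below
is only an illustration: <code>read_at</code> and <code>read_file_at</code> are hypothetical helpers, not part of any
library, and on an error or at the end of the file they simply return whatever arrived so far:

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>
#include <vector>

// pread fills a caller-provided buffer and reports how many bytes arrived.
// This adapter hides that behind the shape of the simulator's read:
// ask for (length, offset), get the bytes back as a vector.
std::vector<char> read_at(int fd, std::size_t length, off_t offset)
{
	assert(fd >= 0);
	std::vector<char> buffer(length);
	std::size_t total = 0;
	while (total < length) {
		const ssize_t count = pread(fd, buffer.data() + total, length - total,
			offset + static_cast<off_t>(total));
		if (count <= 0)
			break;  // error or end of file: keep what arrived so far
		total += static_cast<std::size_t>(count);
	}
	buffer.resize(total);
	return buffer;
}

// Convenience wrapper for trying it out on a regular file.
std::vector<char> read_file_at(const char * path, std::size_t length, off_t offset)
{
	const int fd = open(path, O_RDONLY);
	if (fd < 0)
		return {};
	auto result = read_at(fd, length, offset);
	close(fd);
	return result;
}
```

<p>The loop is there because <code>pread</code> is allowed to return fewer bytes than requested.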
<p>With this, we can:
<pre>
auto data = hwd::memory::read(5, 0);
</pre>
<p>But that's only because we know the length of the data beforehand. And the truth is that a program needs to know
something before it starts reading and then using the data. That something is usually a standard: a memory layout or a
format. In this case, we could assume that the program needs to store only one null-terminated string. This kind of
knowledge would be enough for the program to write and read arbitrary strings as long as they are null-terminated.
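<p>As a rough sketch of that assumption: the function below reads a single null-terminated string that starts at
offset 0. The name <code>read_c_string</code> and the <code>limit</code> parameter are made up for this example, and
the <code>read</code> argument is only assumed to have the same shape as <code>hwd::memory::read</code>:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Same shape as the simulator's read: (length, offset) -> bytes.
using read_fn = std::function<std::vector<char>(std::size_t, std::size_t)>;

// Read one byte at a time from offset 0 until a null byte shows up.
// The limit guards against memory that holds no terminator at all.
std::string read_c_string(const read_fn & read, std::size_t limit)
{
	assert(read);
	std::string result;
	for (std::size_t offset = 0; offset < limit; ++offset) {
		const auto chunk = read(1, offset);
		if (chunk.empty() || chunk[0] == '\0')
			break;
		result.push_back(chunk[0]);
	}
	return result;
}
```

<p>Reading byte by byte is slow, but it shows the point: the only thing the program knows up front is the convention.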
<p>Since we are using strings as an example, there is also another way to implement them: a length coupled with an
array. This format allows the string to contain null bytes, which is convenient when storing slightly more arbitrary
data.</p>
<img src="playing_around_with_simple_interfaces-2.png" alt="i/o">
<p>Assume that you know the size of the memory (in the case of this simulator it's 10KB), figure out the number of bytes
you need to store the maximum length of the data (2 bytes here), and then just declare that the first X bytes of that
memory hold an integer. When writing, put the length of the data into these bytes. When reading, get the length from
them and then read that many bytes from the memory that follows.
<p>More or less it looks something like this:
<pre>
if (data.size() &lt;= 9998) {
	prepend_uint16_to_vector(data, static_cast&lt;std::uint16_t&gt;(data.size()));
	hwd::memory::write(data, 0);
}
</pre>
<p>Where <code>prepend_uint16_to_vector</code> does exactly what you would expect from it in a well-defined way (e.g.,
endianness is consistent between platforms). Then, to read, reverse the process:
<pre>
auto data = hwd::memory::read(2, 0);
auto length = read_uint16_from_vector(data);
data = hwd::memory::read(length, 2);
</pre>
<p>In this case <code>read_uint16_from_vector</code> also does what you would expect from it. Note that this sample does
two reads in total: the first to get the length, the second to get the data of the given length.
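<p>The two helpers are not shown in this post, so here is one possible way to write them. Treat it as a sketch: the
choice of big-endian byte order is an assumption made for this example, and any fixed order would do as long as both
functions agree on it:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Insert the value as two bytes, most significant first (big-endian),
// regardless of the byte order of the machine running the code.
void prepend_uint16_to_vector(std::vector<char> & data, std::uint16_t value)
{
	const char bytes[2] = {
		static_cast<char>(static_cast<unsigned char>(value >> 8)),
		static_cast<char>(static_cast<unsigned char>(value & 0xFF)),
	};
	data.insert(data.begin(), bytes, bytes + 2);
}

// Rebuild the integer from the first two bytes, in the same order.
std::uint16_t read_uint16_from_vector(const std::vector<char> & data)
{
	assert(data.size() >= 2);
	const auto high = static_cast<unsigned char>(data[0]);
	const auto low = static_cast<unsigned char>(data[1]);
	return static_cast<std::uint16_t>((high << 8) | low);
}
```

<p>Shifting and masking instead of copying the integer's raw bytes is what keeps the stored layout independent of the
platform.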
<h2>How to Do the Same With Multiple Files</h2>
<p>Now you might start wondering how to apply this knowledge to store multiple independent files. The thing is, you
are already there. Well, almost. First, initialize the data from standard input:
<pre>
std::vector&lt;char&gt; data {std::istreambuf_iterator&lt;char&gt;{std::cin}, {}};
</pre>
<p>Of course, there are better ways to do it, but for the sake of the length of this post, let's not discuss them. In
case this looks like Elvish to you, trust me: this reads standard input into <code>data</code>. After you have this,
the process remains the same: get the length, put it in the first two bytes of the memory, and then put the data itself.
<p>You have the input stream saved in memory. Now read it using our method and then output it to the standard output
stream:
<pre>
for (const auto &amp; byte : data)
	std::cout &lt;&lt; byte;
</pre>
<p>Or using any other method as long as it doesn't create any clutter.
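<p>Put together, the writer and the reader could look more or less like the sketch below. Everything in it is an
assumption made to keep it self-contained: <code>fake_memory</code> is a stand-in for the simulator with a guessed
capacity of 10000 bytes, the length is stored big-endian, and <code>store</code>, <code>load</code>, and
<code>round_trip</code> are names invented for this example:

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for hwd::memory so that the sketch builds on its own.
namespace fake_memory {
constexpr std::size_t capacity = 10000;  // assumed size of the simulated memory
std::array<char, capacity> storage {};

bool write(std::vector<char> data, std::size_t offset)
{
	if (offset > capacity || data.size() > capacity - offset)
		return false;
	std::copy(data.begin(), data.end(), storage.data() + offset);
	return true;
}

std::vector<char> read(std::size_t length, std::size_t offset)
{
	if (offset > capacity || length > capacity - offset)
		return {};
	return std::vector<char>(storage.data() + offset, storage.data() + offset + length);
}
}

// Writer: slurp the stream, then store a 2-byte big-endian length and the bytes.
bool store(std::istream & input)
{
	std::vector<char> data {std::istreambuf_iterator<char>{input}, {}};
	if (data.size() > fake_memory::capacity - 2)
		return false;
	assert(data.size() <= 0xFFFF);  // guaranteed by the capacity check above
	const auto length = static_cast<std::uint16_t>(data.size());
	data.insert(data.begin(), {static_cast<char>(length >> 8), static_cast<char>(length & 0xFF)});
	return fake_memory::write(data, 0);
}

// Reader: get the length first, then exactly that many bytes, and print them.
bool load(std::ostream & output)
{
	const auto header = fake_memory::read(2, 0);
	if (header.size() != 2)
		return false;
	const auto length = static_cast<std::size_t>(
		(static_cast<unsigned char>(header[0]) << 8) | static_cast<unsigned char>(header[1]));
	const auto data = fake_memory::read(length, 2);
	output.write(data.data(), static_cast<std::streamsize>(data.size()));
	return data.size() == length;
}

// Round trip through the memory using string streams instead of a terminal.
std::string round_trip(const std::string & text)
{
	std::istringstream input {text};
	std::ostringstream output;
	if (!store(input) || !load(output))
		return {};
	return output.str();
}
```

<p>In the two real programs, <code>store(std::cin)</code> would be the body of the writer and
<code>load(std::cout)</code> the body of the reader.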
<p>By now you either got it or you started wondering where this is going. We are not storing any files in the memory!
I'm lying! That's a fair accusation. We are not doing it, but we created an interface that allows us to do it. Think
about it:
<pre>
$ tar cf - directory/ | ./our_memory_write
$ ./our_memory_read | tar tf -
directory/file_a
directory/file_b
</pre>
<p>Ha! Assuming you have enough space, you can even create an image, format it, and then write it to memory:
<pre>
$ fallocate -l 9998 filesystem.img
$ mkfs.bfs filesystem.img
$ ./our_memory_write &lt; filesystem.img
</pre>
<p>Quite satisfying, but that's not the point of this post.
<h2>Design Simple Interfaces</h2>
<p>Now, technically speaking, this whole example can be limiting in a couple of aspects. The primary thing you may not
like about it is that it focuses on shell usage, and for some that's an outdated approach. One way to solve it would be
to follow the Hurd way and implement the interface as a filesystem in userspace.
<p>However, technicalities weren't meant to be the primary topic; it just so happened because they were fun to write
about. What I want you to take from this example is that designing interfaces, following conventions or standards, and
choosing the points and levels of contact are all important things. You may think that it doesn't matter when you build
just a single program, but in such cases it matters even more, because if the program ends up being used in the long
term, sooner or later it will be integrated with other programs.
<p>This isn't about UNIX philosophy. This isn't about standardization of the entire world. No. It's all about you, your
creations, and things that you integrate with. And it sounds all hippie-dippie-unicorns, but that's just how it is.
Think about what you are building. Think about what you will need and what you will integrate with. Find common generic
interfaces that will last and build upon them. Be conscious about it. This method of saving files into memory works only
because it builds on a well-established and incredibly simple convention. Look at the all-popular modern Web APIs: they
are built on top of HTTP so well that it's hard to believe that there are any other protocols out there on the net.
</article>
<script src="https://stats.ignore.pl/track.js"></script>