Objective: Understand the basic concepts of distributed systems and the role of Ice in its implementation in ArmarX.
Previous Tutorials: Create a "Hello World!" Component in ArmarX (C++)
Next Tutorials: Write a Server and Client Communicating via RPC (C++)
All problems in computer science can be solved by another level of indirection, except for the problem of too many layers of indirection. - David Wheeler
A fundamental property of ArmarX is that it realizes a distributed system. This fact has some important implications for its design and implementation, so we think it is useful to take some time here to explain the basic concepts. But before that, we should understand the requirements of a framework for complex robot software.
If you are familiar with ROS (Robot Operating System), you might want to have a look at the comparison of concepts between ArmarX and ROS.
A humanoid robot's software has to continuously perform several tasks, such as processing sensor data or controlling the robot's joints. Some tasks are completely independent, but some rely on the results of other tasks. Now, imagine you are implementing a cool new algorithm, e.g. for grasping an object, that should be added to the robot. Clearly, there is a need to logically structure the robot's software in a way that makes it easy to add your new algorithm.
One way to do that is to split the program logically into several workers. Each worker either continuously performs a task (e.g. processing camera images), or reacts to occurring events (e.g. a speech command). Of course, there must be a way for the workers to communicate, i.e. to exchange data and send notifications.
At this point, you might already have a potential solution in mind: Threads. Let us see how viable that solution is.
You may already be familiar with threads in programming. A thread is a single worker working on your code with its own control flow (e.g. flow through if/else/for/while statements) and its own local variables (e.g. in a function). You can visualize it like a thread on a needle navigating through your code.
A thread resides in a process. A process is a running program (i.e. code in execution) with its own main memory space; it has at least one thread, but can contain many more. The threads in a process run concurrently, each in its own context. However, they generally share their memory space: The process has only one virtual memory address space, and it is used by all threads. Do not be confused here: A program can (and should) be written in a way that clearly separates thread-local from shared variables. But in theory, every thread could access all variables, so it is easy to exchange data between them.
In this context, we can visualize the structure of a process like this:
As can be seen, threads can communicate over the memory of the process, i.e. by reading and writing shared variables in the memory. The thick outline of the process block represents the series of strong protection mechanisms the operating system (OS) implements for processes. An important one is that one process cannot directly access the memory of another process.
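As a minimal illustration (plain C++, not ArmarX code; the variable and function names are ours), here is how two threads can exchange data simply by writing to and reading from a shared variable:

```cpp
#include <iostream>
#include <mutex>
#include <thread>

int main()
{
    int sharedCounter = 0;  // lives in the process's memory, visible to all threads
    std::mutex mutex;       // protects the shared variable from concurrent access

    auto worker = [&](int amount)
    {
        for (int i = 0; i < 1000; ++i)
        {
            std::lock_guard<std::mutex> lock(mutex);
            sharedCounter += amount;  // "communication" by writing shared memory
        }
    };

    std::thread t1(worker, 1), t2(worker, 2);
    t1.join();
    t2.join();

    std::cout << "sharedCounter = " << sharedCounter << "\n";  // prints 3000
    return 0;
}
```

The mutex is necessary because both threads access the same variable concurrently; apart from that, exchanging data costs nothing more than a memory access.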
Let us summarize what we discussed so far about threads:

- Threads are lightweight: Creating a thread and switching between threads is cheap.
- All threads of a process share the same memory, so exchanging data between them is easy and fast.
Sounds good, doesn't it? However, there are also some significant limitations that surface when increasing the system's complexity:

- No isolation: If a single thread crashes (e.g. due to a segmentation fault), it takes down the whole process - and with it all other workers.
- One program: All workers have to be compiled into a single executable, which in practice means a single programming language, and any change requires rebuilding and restarting the whole program.
- One machine: A process runs on a single computer, so the software cannot be spread over several machines.
So clearly, while wrapping your complete software stack in a single process has benefits, it also shows some serious limitations. How can we overcome them? One answer is, as is often the case: An additional layer of indirection.
In a distributed system, we use multiple processes to build our software stack. Imagine that each process roughly takes the role of a thread in the previous design. For example, there is one process fetching images from the camera and processing them, and another process controlling the robot's joints. Of course, each process can still run multiple threads, but this is now up to the developer of that process, not something the framework requires.
You can visualize several processes communicating in a distributed system like this:
An important aspect is that there is no shared memory space among the different "workers" anymore. As said before, the OS provides strong protection for processes, e.g. processes cannot simply access each other's memory directly, so it is more difficult to exchange data between processes than between threads of the same process. However, the OS also provides tools that allow processes to communicate: This is called inter-process communication (IPC). In addition, distributed systems often rely on a middleware, which offers application programmers a higher-level interface for IPC than the OS does.
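To make this concrete, here is a minimal sketch of OS-level IPC (plain POSIX on Linux, not ArmarX or Ice code): a parent and a child process exchange a message through a pipe, since they cannot simply read each other's variables:

```cpp
#include <cstdio>
#include <sys/wait.h>
#include <unistd.h>

int main()
{
    int fds[2];
    if (pipe(fds) != 0)  // fds[0]: read end, fds[1]: write end
    {
        return 1;
    }

    if (fork() == 0)  // child process: shares no variables with the parent
    {
        close(fds[0]);
        const char message[] = "hello from the child process";
        write(fds[1], message, sizeof(message));  // explicit message passing
        close(fds[1]);
        return 0;
    }

    close(fds[1]);
    char buffer[64] = {0};
    read(fds[0], buffer, sizeof(buffer) - 1);
    close(fds[0]);
    wait(nullptr);

    std::printf("parent received: %s\n", buffer);
    return 0;
}
```

A middleware like Ice builds higher-level abstractions (such as remote procedure calls) on top of such low-level mechanisms, typically using network sockets instead of pipes.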
The inconvenient thing is that we now have to think about how workers communicate instead of just writing and reading shared variables. The good thing is that we now have to think about how workers communicate, because it forces the definition of clear interfaces and makes communication explicit (e.g. by sending messages instead of writing a shared variable).
Let us look at the limitations we found before and see whether we improved something:

- Isolation: If one process crashes, the others keep running; the crashed process can be restarted independently.
- Many programs: Each process is a separate program, so different processes can be written in different programming languages, and a single process can be changed, rebuilt and restarted without touching the rest.
- Many machines: Processes can communicate over the network, so the software can be distributed over several computers.
Sounds good! Of course, adding such an extra layer also has downsides. The overall system complexity will be higher, and testing can become a bit more difficult when you have to start several processes instead of one. However, we think it is useful to understand the requirements that lead to the design decision that a distributed system is adequate for a (humanoid) robot software framework.
So, let us look into the details of what this means.
In ArmarX, there are two main communication paradigms: Remote procedure calls (RPC), and topics.
The first paradigm is a client-server communication. In a nutshell, a remote procedure call (RPC) is like a regular function call, but remote, i.e. you call a function in a different process via the network.
There are at least two processes: A server and one or multiple clients. The server offers a service that clients can use. More precisely, a client can send a request carrying the input data to the server. The server processes the request, and sends back a response carrying the output data to the client.
The service offered by the server is described by an interface. For example, imagine the service of generating a random number in an interval. You can imagine such an interface like this:

- The interface `RandomNumberGeneratorInterface` offers a function `generateRandomNumber()`.
- The function takes a `GenerateRandomNumberRequest` as input, which contains two integers (`low` and `high`) which define the allowed interval.
- It returns a `GenerateRandomNumberResponse` containing the random number.

So conceptually, it is not that different from what you might expect from a classical programming language. The difference is that this function `generateRandomNumber()` can be called via the network from another process: the client.
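In plain C++ terms, such an interface could look like the following sketch (a conceptual analogue only; in ArmarX, interfaces like this are defined in Slice, which we introduce below):

```cpp
// Conceptual C++ analogue of the interface described above.
// The type and function names follow the description; they are illustrative.
struct GenerateRandomNumberRequest
{
    int low;   // lower bound of the allowed interval
    int high;  // upper bound of the allowed interval
};

struct GenerateRandomNumberResponse
{
    int randomNumber;  // the generated random number
};

class RandomNumberGeneratorInterface
{
public:
    virtual ~RandomNumberGeneratorInterface() = default;

    virtual GenerateRandomNumberResponse
    generateRandomNumber(const GenerateRandomNumberRequest& request) = 0;
};
```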
Let us look at an example remote procedure call:
1. The client sends a request with `low = 1` and `high = 10`.
2. The server receives the request and generates a random number between `1` and `10` - say it generates the number `3`.
3. The server sends back a response containing the `3`.
4. The client receives the response carrying the `3`, and continues.

Let us note some important properties of the client-server paradigm / remote procedure call:

- The call is synchronous by default: The client blocks until the response arrives, just like with a regular function call.
- Communication is bidirectional: The client sends input data to the server and receives output data back.
- There is some coupling: The server can run stand-alone, but a client can only make its call while the server is running.
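Ignoring the network for a moment, the logic of this example can be sketched in plain C++ (using the conceptual types from above; the Ice-based, truly remote version follows in a later tutorial):

```cpp
#include <iostream>
#include <random>

// Conceptual types from the sketch above, repeated for self-containment.
struct GenerateRandomNumberRequest  { int low; int high; };
struct GenerateRandomNumberResponse { int randomNumber; };

// What the server does when a request arrives.
GenerateRandomNumberResponse generateRandomNumber(const GenerateRandomNumberRequest& request)
{
    static std::mt19937 engine{std::random_device{}()};
    std::uniform_int_distribution<int> distribution(request.low, request.high);
    return {distribution(engine)};
}

int main()
{
    // The "client" sends a request with low = 1 and high = 10 ...
    const GenerateRandomNumberRequest request{1, 10};
    // ... the "server" processes it and sends back a response ...
    const GenerateRandomNumberResponse response = generateRandomNumber(request);
    // ... and the client receives the number (e.g. 3) and continues.
    std::cout << "Received random number: " << response.randomNumber << "\n";
    return 0;
}
```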
In one of the next tutorials, we will see how remote procedure calls can be implemented in ArmarX.
The graphic above also highlights the different software layers that are involved:
- The application layer contains your "business code": This is what your application does,
i.e. the logic code that solves your actual, concrete problem. You usually want this part of the code to be independent and simple to use. That is, it should only rely on dependencies (e.g. libraries) that are necessary to solve that sub-problem, and it should use features of the programming language (e.g. overloading) to allow convenient usage.
- The transport layer fulfils the duty of transporting your data between processes,
e.g. from the client to the server and from the server to the client. It does not know anything about your application logic, except what the data and interfaces look like.
- Code at the boundary between the application and the transport layer often involves conversions
between types which are defined in the programming language and the types which are defined in the transport layer.
For example, imagine that you are dealing with mathematical 3D vectors of the form (x, y, z).
- For your business code (application layer), you will want to use `Eigen::Vector3f` or `Eigen::Vector3d` in C++ and `np.ndarray` in Python. These types are strong, as they use features of their respective programming language to provide efficient and convenient usage (think of mathematical operations, slicing, etc.). In addition, they are widely used and well-supported by other libraries. They are what we call business objects (BOs).
- The transport layer only cares about how your data types are structured, but not what they mean:
It has no idea (and does not care) that two 3D vectors can be multiplied to return a scalar number. All it needs to do is transport the data from one point to the other. Also, the transport layer has to support different programming languages. Therefore, these data transfer objects (DTO) are usually very limited in their functionality (they are mere data containers). As a consequence, you do not want to use the DTOs in your business code apart from very simple cases.
When you are designing your distributed application, it is good to think about what part of it is application/business code (usually the majority), and where it touches the transport layer. Ideally, the core of your business code (e.g. the math) can also be used in a non-distributed way (as a library), and the code connecting it to the transport layer is clearly separated from that core.
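As a small sketch of such a boundary (the DTO type and the conversion function names are illustrative, not an actual ArmarX API):

```cpp
#include <Eigen/Core>

// Data transfer object (DTO) as the transport layer might define it:
// a plain container without any vector math.
struct Vector3fDto
{
    float x = 0.0f;
    float y = 0.0f;
    float z = 0.0f;
};

// Conversions at the boundary between application and transport layer.
inline Vector3fDto toDto(const Eigen::Vector3f& bo)
{
    return {bo.x(), bo.y(), bo.z()};
}

inline Eigen::Vector3f fromDto(const Vector3fDto& dto)
{
    return Eigen::Vector3f(dto.x, dto.y, dto.z);
}
```

Keeping such conversion functions in one clearly separated place means the rest of your business code only ever sees `Eigen::Vector3f`.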
In principle, you can imagine the layer diagram of such a software architecture like this:
Now, back to the actual topic!
The second paradigm is a publisher-subscriber communication. Other terms for the same concept are "producer-consumer", "provider-listener", and "signal-slot". In principle, topics realize the observer pattern. A topic is like a broadcasting channel: Publishers send messages over the topic, and all subscribers of the topic receive these messages.
In contrast to a server in the client-server paradigm, a topic is not a process of its own. Conceptually, it is just a name and an associated topic interface. The interface defines what messages can be sent over the topic. Topic interfaces differ in one important aspect from a service interface: They never return anything. This is because communication via topics is unidirectional: A publisher sends away a message, but does not directly receive a response. Therefore, the corresponding interface functions always have a `void` return type:
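For example, a topic interface for (hypothetical) speech commands could look like this conceptual C++ analogue (in ArmarX, it would again be defined in Slice):

```cpp
#include <string>

// Conceptual analogue of a topic interface. Note the void return type:
// a publisher sends messages away and never receives a response.
class SpeechCommandTopicInterface
{
public:
    virtual ~SpeechCommandTopicInterface() = default;

    // Called by publishers; delivered to all subscribers of the topic.
    virtual void reportSpeechCommand(const std::string& command) = 0;
};
```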
Again, let us note some important properties of topic-based communication:

- Communication is asynchronous: Publishing a message is fire-and-forget; the publisher does not wait for the subscribers to process it.
- Publishers and subscribers are loosely coupled: They do not need to know each other, and a message may be received by many subscribers - or by none at all.
- Topics are many-to-many: Any number of publishers can publish on a topic, and any number of subscribers can listen to it.
Let us summarize and compare communication via RPCs and topics again:
| Remote Procedure Calls (RPC, Server-Client) | Topics (Publisher-Subscriber) |
|---|---|
| 1 server to n clients (1-to-many) | m publishers to n subscribers (many-to-many) |
| Bidirectional (client to server to client) | Unidirectional (publishers to subscribers) |
| Synchronous by default (client waits for response) | Asynchronous |
| Some coupling: Server is stand-alone, but client depends on server | Loose coupling: Publishers and subscribers are independent |
| Similar to a regular function call | Similar to the observer pattern |
To realize the required inter-process communication, ArmarX relies on the RPC framework Ice by ZeroC (Ice also supports topics). Ice can exchange data between many programming languages. Therefore, it requires data types and interfaces to be defined in a language that is independent of any concrete programming language. In general, such a language is called an interface definition language (IDL).
In Ice, this language is called Slice (Specification Language for Ice); in ArmarX, interfaces like the ones described above are defined in Slice. Syntactically, Slice looks similar to C++, but of course the two are very different. Slice code is usually stored in `.ice` files. You should give the Slice documentation a brief read - it is not long, but it spares us from repeating everything here. However, for the sake of self-containment, we reproduce the most important statement in this context here:
Slice definitions are compiled for a particular implementation language by a compiler. The language-specific Slice compiler translates the language-independent Slice definitions into language-specific type definitions and APIs. These types and APIs are used by the developer to provide application functionality and to interact with Ice. The translation algorithms for various implementation languages are known as language mappings, and Ice provides a number of language mappings (for C++, C#, Java, JavaScript, Python and more).
Note that ArmarX itself mainly makes use of the C++ and Python mappings.
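As a small preview (a hedged sketch based on Ice's C++ mapping, not code from the tutorials; the header, proxy name and endpoint are illustrative and would come from compiling a corresponding `.ice` file with `slice2cpp`):

```cpp
#include <Ice/Ice.h>

#include "RandomNumberGenerator.h"  // hypothetical output of slice2cpp

int main(int argc, char* argv[])
{
    // Initializes the Ice runtime; destroys it when leaving scope.
    Ice::CommunicatorHolder ich(argc, argv);

    // Obtain a typed proxy to the remote server object.
    auto base = ich.communicator()->stringToProxy("RandomNumberGenerator:default -p 10000");
    auto generator = Ice::checkedCast<RandomNumberGeneratorInterfacePrx>(base);

    // The remote procedure call: It looks like a regular function call,
    // but runs in the server process. Types come from the generated header.
    GenerateRandomNumberRequest request;
    request.low = 1;
    request.high = 10;
    GenerateRandomNumberResponse response = generator->generateRandomNumber(request);

    return 0;
}
```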
In the next tutorials, we will see how both remote procedure calls and topics can be implemented in ArmarX: