Sorting algorithms

This is a refresher of sorting algorithms since I recently realized that I don’t remember how a lot of the most common sorting algorithms work. I’m only going to focus on arrays on this article since it is the most common structure for these kind of problems.

Bubble sort

This is the first algorithm we learn at school. It is not very efficient(O(n ^ 2) in most of the cases) but it is pretty easy to implement.

  • Grab the first two elements(0 and 1) in an array and compare them
  • If the element in the left is higher than the elements in the right then swap them
  • Grab the next two elements(1 and 2) and do the same
  • Repeat until the greatest element is in the far right
  • Do the same starting from the first two elements(0 and 1) but ending before reaching the last element(which is already sorted)
  • At the end the array will be sorted

Read more »

Stack and heap memory in C++

Since I’ve been working in C++, I’ve noticed the necessity to get familiar with concepts about computer architecture and how the operating system works. One thing that has been bothering me for a few weeks is hearing people talk about the heap and stack memory, while I don’t really know what those things mean.

I decided to write this post to try to explain to myself what is the difference between stack memory and heap memory and hopefully understand why people keep talking about them.

The names

Stack and Heap are both names of data structures. This can lead to some confusion, because although the stack is called like that because it works similar to a stack (I’ll explain more about this), the heap is not related at all to the heap data structure. The exact origin of the heap word in this context is unknown to me, but from a little research it seems like it’s use is based on the English language definition for that word:

1
An untidy collection of things piled up haphazardly

Which actually describes pretty well what a heap of memory is.

Read more »

Writing tests for C++ code

The time has come for me to start writing some tests for my C++ code, and I have to admit I’m a little nervous. The company I’m working for uses Google Test as their test framework, so I will trust their expertise and use it too.

Set up

Lets start by creating a folder for our project:

1
2
mkdir ~/project
cd ~/project/

The next step is to download and unzip google test:

1
2
3
wget https://github.com/google/googletest/archive/release-1.8.0.tar.gz
tar -zxf release-1.8.0.tar.gz
rm release-1.8.0.tar.gz

Read more »

Introduction to GDB

GDB is the GNU project debugger. It can be used to see what a program is doing or what it was doing when it crashed. GDB can be used with a variety of languages. Because I’m learning C++, I’m going to explain it in the context of C++.

Adding debugging symbols

One of the stages of the compilation of a C++ program is to generate an object file (file.o). This object file contains what is called a symbol table, which contains each identifier in the code with information associated with it (type, constness, etc…).

If we want to be able to use GDB in one of our programs we need to add debugging information to this table (debug symbols). To add debug symbols to our binary we use the -g flag:

Read more »

Futures, async, packaged_tasks and promises in C++

If you are unfamiliar with threads in C++ I recommend you take a look at my article about threads before reading this one.

Futures

Futures are a concept of concurrent programming. A future is a way to access a result from an asynchronous operation. As a simple example:

1
2
std::future<int> fut = functionThatReturnsFuture();
int val = fut.get();

The code above should be very easy to read. We execute a function and it returns a future. Then we use this future to get a value. The interesting thing about this code is that functionThatReturnsFuture can be (and most likely is) an asynchronous operation. When we call fut.get(), our code will wait for that asynchronous operation to complete. When the operation completes, it will return an int value that will then be assigned to val.

Read more »

Introduction to C++ threads

I’m getting started with concurrency in C++ and threads seem to be a good way to get familiar with the basics.

Processes

When a user starts executing a program, the program becomes a process. A process is a set of instructions (the code) and state (memory, registers) that is managed by the operating system. The operating system tells the processor which process it should be running.

In a system with a single core, only one process can be executed at one point in time. Since there are a lot of processes running in a modern system, the operating system will take care of deciding which processes should be serviced by the CPU.

Processes can create other processes by using Fork. When a process forks, they become independent of each other. They own their own instructions and state.

Read more »

Autotools

A few days ago I was listening to a talk about the new features of C++ and I heard the presenter mention autools. I felt pretty dumb not knowing what he was talking about so I’m writing this post to make me feel less dumb.

Make

It all started with Stuart Feldman’s make. Make is a tool that generates files based on other files. In a makefile you can specify a list of files you want to generate and how to generate those files (based on other files). The most common use for make is to generate executables based on source code.

Make allows people to generate executables easily based on source code. People in possession of the code don’t need to know the steps to build the executable because these steps are already recorded in the makefile. Another advantage of make is that it allows for faster builds by keeping tracks of source files that haven’t changed since last time a build was run and skipping unnecessary steps. This is specially useful for large codebases that take long time to compile.

Read more »

C++ types, pointers and references

Writing a post about types in most high level languages I’m used to wouldn’t be very interesting, but I’ve recently started learning C++ and I realized that I need to understand memory a little better to be able to write and read C++ programs effectively.

There is a collection of types that should not cause many surprises. .

Integer types

char 1 byte
short 2 bytes
int 4 bytes
long 8 bytes
long long 16 bytes

Floating point types

float 14 bytes
double 18 bytes
double double 116 bytes

*The sizes above are for GNU C compiler, but might vary for different compilers
*All numeric types can be modified with the unsigned keyword. This affects the minimum and maximum possible values that can be hold and the way arithmetic operations work on them, but not their size in bytes.

Read more »

C++ Header filesĀ 

I’m writing C++ in the title of this article because I’m currently in a journey to learn C++. I believe the same concepts apply to C.

Writing about C++ is a little harder than writing about other languages because I keep stumbling into circular references where I need to understand A in order to understand B, but it’s very hard to understand A without understanding B.

I’m going to try to start with this article where I’ll explain why C++ has header files (files with .h extension) and how to use them.

Code separation

Before we start looking into header files, lets first look at how code is split and included in languages where there are no header files. This little example is in node.js:

Read more »

Introduction to etcd

In previous posts I wrote a little about distributed systems and the Raft algorithm. Today I’m going to look at one distributed key-value store that uses the Raft algorithm to achieve consistency and high availability.

From a client’s perspective, etcd will behave like any other key value store out there. It’s use of Raft underneath will make sure that there is only one leader at a given time and that the log is replicated to all nodes.

Getting ready

For this exercise I’m going to create a 5-node cluster, but before we start there are a few things we need to decide.

By default each etcd nodes uses port 2380 for communicating with clients and port 2379 for server to server communication. We will keep this default behavior.

Each node in the cluster needs to be able to communicate with the rest of the nodes in the cluster. The number of nodes in the cluster and their location needs to be configured for the cluster to be able to do some work.

In normal conditions we would have each node in a different host with a different IP Address. This would allow us to say something like: You can find node A at 10.10.10.2.

Running the cluster in a single machine makes things challenging because they would all be sharing the same IP address. To walk around this issue, we will create our own docker network and work within this network.

Read more »