Eight things programming languages do
Part Two: Things 6 -8
In part two we continue to look at what programming languages have in common.
6. Arrays and lists
In part one we saw that variables (”labeled boxes”) can be assigned various values, such as numbers and strings of characters. You may have thought that there is a one-to-one relationship between variables and values, that is one variable holds one value. But you can also make a variable hold many values, provided those values are arranged in an array or list.
As the name suggests, an array is a collection of values that have a specific order. For our purposes, a list is the same thing and I use the terms interchangeably in this article.
Let’s have a look at my shopping list this week - it’s a bit sparse, consisting of:
My shopping list
apples
beer
coffee
OK, OK - can I help it if I keep my shopping list in alphabetical order? But it helps to illustrate my point - my shopping list could be stored in an array and the specific order that I prefer (nice, neat alphabetical) will be retained.
If we’re to computerise a shopping list, we will need to find a way of getting things into and out of our list.
Listing 1 shows my shopping list in our five sample languages. First the list is created, then each item in the list is retrieved and printed one at a time. All of these programs do the same thing - output my shopping list as shown above.
Listing 1
Java
ArrayList shopping = new ArrayList();
shopping.add( "apples" );
shopping.add( "beer" );
shopping.add( "coffee" );
System.out.println( shopping.get(0) );
System.out.println( shopping.get(1) );
System.out.println( shopping.get(2) );
Perl
@shopping = ("apples", "beer", "coffee");
print "$shopping[0]\n";
print "$shopping[1]\n";
print "$shopping[2]\n";
Python
shopping = ["apples", "beer", "coffee"]
print shopping[0]
print shopping[1]
print shopping[2]
REBOL
shopping: ["apples" "beer" "coffee"]
print shopping/1
print shopping/2
print shopping/3
Scheme
(define shopping '("apples" "beer" "coffee"))
(display (car shopping))
(newline)
(display (car (cdr shopping)))
(newline)
(display (car (cdr (cdr shopping))))
(newline)
All except Java can create a list in one line - the values in the list are grouped together inside brackets (sometimes square brackets [] and sometimes round ()) and there is some kind of separator between each of the values - Perl and Python use a comma, REBOL and Scheme use a space. This array of values is then assigned to the shopping variable as a single unit.
Java is a little different, requiring that we declare the variable shopping to be an ArrayList type then each item is added one by one. A little bit more work, but the result is the same - a variable holding the three things on my shopping list.
When it comes to getting things out of the list, Scheme is the odd-one-out and we’ll come back to that example shortly. All of the others use an index to say which item should be retrieved. Remember that lists have a specific order, so if you give your program the instruction that means “get the second item from the shopping list” it will produce the same value every time.
The index is the way that we say which item we want. The usual way to do this is to put the index number beside the variable name. Perl and Python require brackets around the index and REBOL puts the index number at the end after a slash. Java makes it very clear what is going on by adding the .get( index ) instruction to the end of the variable.
There’s a few things to note about index values:
Programming languages usually start counting at zero - so the first item in the list will be index 0, the second will be index 1, and so on. Zero-based indexing is by far the most common technique but just to prove that it isn’t always so REBOL starts counting at 1 (which is a little more people-friendly).
Just because you can add an index to the end of a variable doesn’t mean it will work. If the variable doesn’t hold a list then adding an index will either produce an error or wildly unexpected results.
One of the most common programming mistakes is to use an index that is out of range. If any of the examples in Listing 1 asked for index 10, an error would occur because only the first three index positions are defined. For the same reason, negative index values do not work in most programming languages.
Now let’s have a look at that troublesome Scheme example. To understand what’s going on here we need to become familiar with the two Scheme instructions car and cdr (pronounced “could-r”). The car instruction simply takes the first value off a list, so (car shopping) produces the first value in the list - “apples”. That’s what gets printed in the first display command.
In the second display command we find cdr which does the exact reverse of car - it produces the list that remains when you take off the first value. So (cdr shopping) produces the list (”beer” “coffee”). If we then take the car of this list (car (cdr shopping)) we get the value “beer”.
Finally in the last display line, we already know that (cdr shopping) produces list (”beer” “coffee”), so (cdr (cdr shopping)) must produce (”coffee”) - that is the list that remains when we take the first value off the list (”beer” “coffee”). Note that (”coffee”) is a list - it only has one element in it but it’s still a list. We need to use car to get the first (and only) value in this list (car (cdr (cdr shopping))) which produces “coffee”. And the job is done!
An aside
Question: If (car (cdr (cdr shopping))) produces the value “coffee”, what does (cdr (cdr (cdr shopping))) produce?
Answer: () - a list with nothing in it!
7. Functions
Function, procedure, subroutine, and subprogram are all words that mean pretty much the same thing - a batch of programming instructions that can be used repeatedly. Give the function a label, and whenever that label appears in the program the instructions contained in the function will be retrieved and used at that point in the program. This is known as calling a function.
They are particularly useful because different information can be sent to a function each time it is called, which means that the same function can be made to do different things depending on the information it is sent. The individual items of information sent or passed to a function are its parameters. Within the function, the values of the parameters are stored in variables typically referred to as arguments.
Listing 2 puts this into practice. Each example defines a function called printShoppingList, which has one argument list that expects to receive an array when the function is called. The first three elements of the list are then printed. When I pass shopping (the array holding my shopping list) to the printShoppingList function my shopping list is printed.
Listing 2
Java
ArrayList shopping = new ArrayList();
shopping.add("apples");
shopping.add("beer");
shopping.add("coffee");
printShoppingList( shopping );
public static void printShoppingList( ArrayList list ) {
System.out.println( list.get(0) );
System.out.println( list.get(1) );
System.out.println( list.get(2) );
}
Perl
sub printShoppingList {
@list = @_;
print "$list[0]\n";
print "$list[1]\n";
print "$list[2]\n";
}
@shopping = ("apples", "beer", "coffee");
printShoppingList( @shopping );
Python
def printShoppingList( list ):
print list[0]
print list[1]
print list[2]
shopping = ["apples", "beer", "coffee"]
printShoppingList( shopping )
REBOL
printShoppingList: func [ list ] [
print list/1
print list/2
print list/3
]
shopping: ["apples" "beer" "coffee"]
printShoppingList shopping
Scheme
(define shopping '("apples" "beer" "coffee"))
(define printShoppingList
(lambda (list)
(display (car list))
(newline)
(display (car (cdr list)))
(newline)
(display (car (cdr (cdr list))))
(newline))
)
(printShoppingList shopping)
Let’s examine how that works. You’ll notice that all I’ve done here is alter Listing 1 so that the output commands are in a separate function, and where those output instructions once appeared is now a call to the printShoppingList function. It is also hard to miss that in all examples except Java, the function definition appears before the main body of the program. That is the function is defined before it is called in the program. There is no problem doing this, in fact some languages require it.
If we penetrate the differences in the languages, we can see that in essence Java, Perl, and Python define functions in a very similar way. First there is a keyword that indicates we are about to make a function:
Java: public static void
Perl: sub
Python: def
Then the name of the function: printShoppingList is all cases. Next is the name of the argument list in brackets. (Actually Perl skips this - assuming you will probably want to pass information Perl gives you the @_ argument automatically in every function.) Finally, the code that makes up the function is included - indented for Python and enclosed in curly brackets for Perl and Java.
Scheme and REBOL take a slightly different approach. You might have noticed that both languages use the same notation as is used for defining a variable, because that is essentially what is happening in these languages - the value of the printShoppingList “variable” is the function printShoppingList! So, they declare a variable:
REBOL: printShoppingList:
Scheme: define printShoppingList
Then use the keyword that indicates a function is being made:
REBOL: func
Scheme: lambda
And after that it’s pretty much the same as for the other languages - include the argument list followed by the code body of the function.
Of course, the output from Listing 2 is exactly the same as for Listing 1.
8. Repetition
There is a fundamental problem with all the examples so far - they are OK this week when I have only three items on my meagre shopping list but what about next week when I might have 5 or 10 items? (Or even more if I’m feeling particularly prosperous.) As things stand my programs will only ever print the first three items on my list, no matter how many things I actually need to buy.
What I want is a way of printing everything on my list without having to know beforehand how many things will need to be printed. Ideally I need to apply a print command repeatedly to every item in shopping list, no matter how long it is. This is where repetition comes in - a very handy feature available in all good programming languages.
Repetition by recursion
What would happen if I changed my printShoppingList function so that it did the following things? Firstly, check that there is anything in list. If list is empty do nothing otherwise do all of the following in order:
print the first item in list
remove the first item from list
call printShoppingList again passing it the newly shortened list as the parameter
Listing 3 shows what I mean.
Listing 3
Java
ArrayList shopping = new ArrayList();
shopping.add("apples");
shopping.add("beer");
shopping.add("coffee");
printShoppingList( shopping );
public static void printShoppingList( ArrayList list ) {
if ( !list.isEmpty() ) {
System.out.println( list.get(0) );
list.remove(0);
printShoppingList( list );
}
}
Perl
sub printShoppingList {
@list = @_;
if (@list > 0) {
print "$list[0]\n";
printShoppingList( @list[1 .. $#list] );
}
}
@shopping = ("apples", "beer", "coffee");
printShoppingList( @shopping );
Python
def printShoppingList( list ):
if len(list) > 0:
print list[0]
printShoppingList( list[1:])
shopping = ["apples", "beer", "coffee"]
printShoppingList( shopping )
REBOL
printShoppingList: func [ list ] [
if not tail? list [
print list/1
printShoppingList next list
]
]
shopping: ["apples" "beer" "coffee"]
printShoppingList shopping
Scheme
(define shopping '("apples" "beer" "coffee"))
(define printShoppingList
(lambda (list)
(if (null? list)
()
(begin
(display (car list))
(newline)
(printShoppingList (cdr list)))
)
)
)
(printShoppingList shopping)
Can you see that it is possible for printShoppingList to call itself again before it has finished? Surely if we do that the program go on for ever as printShoppingList constantly keeps calling itself over and over!
We avoid this so-called “infinite loop” because each time printShoppingList calls itself, it passes a shorter copy of list to the next version of printShoppingList. Eventually this constantly shortening list will have nothing in it, and when printShoppingList is passed an empty list it does nothing and stops.
When we look at the successive calls to printShoppingList we can see that there are four progressively smaller versions of list each time:
list 1 = (”apples”, “beer”, “coffee”) list 2 = (”beer”, “coffee”) list 3 = (”coffee”) list 4 = ()
In each case in Listing 3, the first thing that the printShoppingList function does is check if the list argument is empty because an empty list is the end point of the function. To put this another way, once the list is empty there are no more items to be printed to so no more processing is required. Other people will try to tell you that this process, called recursion, is frightfully complicated but it’s not - all you need to do is make sure the function checks for its end point first and make sure that when the function calls itself it moves one step closer to reaching that end point.
Repetition by iteration
Now it wouldn’t be computer programming if there wasn’t another way to do something important like repetition. And there is - the process of iteration provides a block of code that is executed for each item in a list. Iteration is sometimes called looping because the program “loops” over the same section of code several times.
Listing 4 illustrates.
Listing 4
Java
ArrayList shopping = new ArrayList();
shopping.add("apples");
shopping.add("beer");
shopping.add("coffee");
for (int i = 0; i < shopping.size(); i++)
{
System.out.println(shopping.get(i));
}
Perl
@shopping = ("apples", "beer", "coffee");
foreach $item (@shopping) {
print "$item\n";
}
Python
shopping = ["apples", "beer", "coffee"]
for item in shopping:
print item
REBOL
shopping: ["apples" "beer" "coffee"]
foreach item shopping [print item]
Scheme
(define shopping '("apples" "beer" "coffee"))
(define displayn (lambda (n) (display n) (newline)))
(for-each displayn shopping)
There are two variations on the theme of iteration here. All of the examples except Java use the for each loop, which simply means what it says: “for each item in the list do this block of code”. Conveniently, the block of code just happens to include the command to print item.
Java provides what is sometimes called a plain for loop, which requires a bit more attention. A for loop needs to know the length of the list it is to work on - fortunately Java can provide this by calling the .size() function on the shopping list. It is then a simple matter of counting from 0 to whatever the size is and looping through the block of code each time, increment the counter i each time. Inside the block we use standard array notation to retrieve each item in turn.
…6,7,8 - That’s all.
In part one of this article we looked at the first five things: output, variables, expressions, input, and selection. Now we have added three more advanced things: arrays, functions, and repetition. To summarise:
Arrays allow a list of values to be assigned to one variable as a single unit. Most languages use an index value to retrieve specific items in an array.
Functions are groupings of code that can be called repeatedly. Parameter values can be passed to a function each time it is called, allowing the function to produce different outcomes each time.
Repetition is an important tool for solving many programming problems. There are two common approaches to repetition - recursion and iteration.
Of course this article has barely scratched the surface of the eight things (and more) that programming languages can do but this should give you a good toe-hold in learning to program or learning a new programming language.
Further reading
Readers interested in a more detailed examination of differences and similarities between programming languages should have a look at syntax across languages.
Examples from this article are available for download from this web site.
tech.thingoid Gosbell First published: PC Update May 2004 (online version updated)
October 22nd, 2005 at 5:34 am
For Perl arrays,
@shopping = qw/apples beer coffee/;
is cleaner
May 21st, 2006 at 1:41 pm
And for rebol, [ ‘apples ‘beer ‘coffee ] is also cleaner as well.