A data type is a collection of related values.
These collections
need not be disjoint, and they are often hierarchical.
Scheme has a rich set of data types: some are simple
(indivisible) data types and others are compound data types
made by combining other data types.
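To make the idea of a simple data type concrete, here is a small sketch using booleans, which Scheme writes as #t (true) and #f (false). The particular expressions are assumptions chosen for illustration; boolean? and not are standard R5RS procedures.

```scheme
(boolean? #t)               ; => #t
(boolean? "Hello, World!")  ; => #f
(not #f)                    ; => #t
(not #t)                    ; => #f
(not "Hello, World!")       ; => #f  (a string is not #f, so it counts as true)
```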
The last expression illustrates a Scheme convenience:
in a context that requires a boolean, Scheme treats
any value that is not #f as a true value.
Scheme numbers can be integers (eg, 42), rationals
(22/7), reals (3.1416), or complex (2+3i). An
integer is a rational is a real is a complex number is a
number. Predicates exist for testing the various kinds of
numberness:
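A minimal sketch of these predicates follows. The sample values are assumptions picked for illustration, and the rational literal 22/7 assumes an implementation that supports exact rationals, as most do.

```scheme
(number? 42)       ; => #t
(number? #t)       ; => #f
(integer? 42)      ; => #t
(integer? 22/7)    ; => #f
(rational? 22/7)   ; => #t
(real? 22/7)       ; => #t
(real? 3.1416)     ; => #t
(complex? 42)      ; => #t  (every integer is also a complex number)
```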
Scheme integers need not be specified in decimal (base 10)
format. They can be specified in binary by prefixing the
numeral with #b. Thus #b1100 is the number twelve.
The octal prefix is #o and the hex prefix is
#x. (The optional decimal prefix is #d.)
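As a quick illustration of the radix prefixes, the following sketch writes the number twelve in all four notations. The octal and hex numerals are examples added here, not taken from the text.

```scheme
#b1100   ; => 12  (binary)
#o14     ; => 12  (octal)
#xc      ; => 12  (hexadecimal)
#d12     ; => 12  (decimal, prefix optional)
(= #b1100 #o14 #xc #d12)   ; => #t
```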
Numbers can be tested for equality using the general-purpose
equality predicate eqv?.
(eqv? 42 42)   => #t
(eqv? 42 #f)   => #f
(eqv? 42 42.0) => #f
However, if you know that the arguments to be compared are
numbers, the special number-equality predicate = is more
apt.
(= 42 42)   => #t
(= 42 #f)   -->ERROR!!!
(= 42 42.0) => #t
Other number comparisons allowed are
<, <=, >, >=.
(< 3 2)    => #f
(>= 4.5 3) => #t
Arithmetic procedures +, -, *, /, expt have the
expected behavior:
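A few sample calls, offered as a sketch: the operands are assumptions chosen so that every result is exact, and the 22/7 result assumes an implementation with exact rationals.

```scheme
(+ 1 2 3)    ; => 6
(- 5 2 1)    ; => 2
(* 1 2 3)    ; => 6
(/ 6 3)      ; => 2
(/ 22 7)     ; => 22/7
(expt 2 3)   ; => 8
```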
For a single argument, - and / return the negation
and the reciprocal respectively:
(- 4) => -4
(/ 4) => 1/4
The procedures max and min return the maximum and
minimum respectively of the number arguments supplied to
them. Any number of arguments can be so supplied.
(max 1 3 4 2 3) => 4
(min 1 3 4 2 3) => 1
The procedure abs returns the absolute value of
its argument.
(abs 3)  => 3
(abs -4) => 4
This is just the tip of the iceberg. Scheme
provides a large and comprehensive suite of arithmetic
and trigonometric procedures. For instance, atan,
exp, and sqrt respectively return the
arctangent, natural antilogarithm, and
square root of their argument. Consult
R5RS [22] for more details.
Scheme character data are represented by prefixing the
character with #\. Thus, #\c is the character
c. Some non-graphic characters have more descriptive
names, eg, #\newline, #\tab. The character for
space can be written #\ , or more readably, #\space.
The character predicate is char?:
(char? #\c) => #t
(char? 1)   => #f
(char? #\;) => #t
Note that a semicolon character datum does not trigger
a comment.
The character data type has its set of comparison
predicates: char=?, char<?, char<=?, char>?,
char>=?.
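The sketch below exercises a few of these predicates, together with the standard case-insensitive comparison char-ci=? and the case-conversion procedures char-downcase and char-upcase. The sample characters are assumptions; in R7RS systems the last three procedures live in the (scheme char) library, which may need to be imported.

```scheme
(char=? #\a #\a)      ; => #t
(char<? #\a #\b)      ; => #t
(char>=? #\a #\b)     ; => #f

(char-ci=? #\a #\A)   ; => #t  (comparison ignoring case)
(char-downcase #\A)   ; => #\a
(char-upcase #\a)     ; => #\A
```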
The simple data types we saw above are self-evaluating. Ie, if you typed any object from these
data types to the listener, the evaluated result returned by
the listener will be the same as what you typed in.
#t  => #t
42  => 42
#\c => #\c
Symbols don't behave the same way. This is because symbols
are used by Scheme programs as identifiers for variables, and thus will evaluate to the value that the
variable holds. Nevertheless, symbols are a simple data
type, and symbols are legitimate values that Scheme can
traffic in, along with characters, numbers, and the rest.
To specify a symbol without making Scheme think it is a
variable, you should quote the symbol:
(quote xyz)
=> xyz
Since this type of quoting is very common in Scheme, a
convenient abbreviation is provided. The expression
'E
will be treated by Scheme as equivalent to
(quote E)
Scheme symbols are named by a sequence of characters. About
the only limitation on a symbol's name is that it shouldn't
be mistakable for some other data, eg, characters or booleans
or numbers or compound data. Thus, this-is-a-symbol,
i18n,
<=>, and $!#* are all symbols; 16, -i (a
complex number!), #t, "this-is-a-string", and
(barf) (a list) are not. The predicate for
checking symbolness is called symbol?:
(symbol? 'xyz) => #t
(symbol? 42)   => #f
Scheme symbols are normally case-insensitive (this is the
R5RS rule; R6RS and R7RS make symbols case-sensitive). Thus,
under R5RS, the symbols
Calorie and calorie are identical:
(eqv? 'Calorie 'calorie)
=> #t
We can use the symbol xyz as a global variable by using
the form define:
(define xyz 9)
This says the variable xyz holds the value 9. If we
feed xyz to the listener, the result will be the value
held by xyz:
xyz => 9
We can use the form set! to change the value held by a
variable:
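For instance, a minimal sketch that rebinds xyz to a character. The definition is repeated so the snippet stands alone, and the new value #\c is an arbitrary choice.

```scheme
(define xyz 9)   ; repeated here so the snippet is self-contained
(set! xyz #\c)   ; xyz now holds a character instead of a number
xyz              ; => #\c
```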
Strings are sequences of characters (not to be confused with
symbols, which are simple data that have a sequence of
characters as their name). You can specify strings by
enclosing the constituent characters in double-quotes.
Strings evaluate to themselves.
"Hello, World!" => "Hello, World!"
The procedure string takes a bunch of characters and
returns the string made from them:
(string #\h #\e #\l #\l #\o)
=> "hello"
Let us now define a global variable greeting.
(define greeting "Hello; Hello!")
Note that a semicolon inside a string datum does not
trigger a comment.
The characters in a given string can be individually
accessed and modified. The procedure string-ref takes a
string and a (0-based) index, and returns the character at
that index:
(string-ref greeting 0)
=> #\H
New strings can be created by appending other strings:
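A short sketch using the standard procedure string-append, which takes any number of strings. The sample strings are assumptions for illustration.

```scheme
(string-append "E "
               "Pluribus "
               "Unum")
; => "E Pluribus Unum"
```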
You can make a string of a specified length, and fill it
with the desired characters later.
(define a-3-char-long-string (make-string 3))
The predicate for checking stringness is string?.
Strings obtained as a result of calls to string,
make-string, and string-append are mutable.
The procedure string-set! replaces the
character at a given index:
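Here is a minimal sketch. The variable name hello is hypothetical, and the string is built with the string procedure so that it is guaranteed to be mutable.

```scheme
(define hello (string #\H #\e #\l #\l #\o))   ; hypothetical name; a mutable string
(string-set! hello 1 #\a)                     ; replace the character at index 1
hello   ; => "Hallo"
```

A string literal such as "Hello" may be immutable in some implementations, which is why the sketch avoids calling string-set! on one.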
Vectors are sequences like strings, but their elements can
be anything, not just characters. Indeed, the elements can
be vectors themselves, which is a good way to generate
multidimensional vectors.
Here's a way to create a vector of the first five integers:
(vector 0 1 2 3 4)
=> #(0 1 2 3 4)
Note Scheme's representation of a vector value: a #
character followed by the vector's contents enclosed in
parentheses.
In analogy with make-string, the procedure
make-vector makes a vector of a specific length:
(define v (make-vector 5))
The procedures vector-ref and vector-set! access and
modify vector elements.
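A small sketch of both procedures follows. The variable name vec and the values stored are assumptions for illustration; indices are 0-based, as with strings.

```scheme
(define vec (vector 0 1 2 3 4))   ; hypothetical name
(vector-ref vec 2)                ; => 2
(vector-set! vec 2 'two)          ; overwrite the element at index 2
vec                               ; => #(0 1 two 3 4)
```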
The predicate for checking if something is a vector is vector?.
A dotted pair is a compound value made by combining
any two arbitrary values into an ordered couple. The
first element is called the car, the second
element is called the cdr, and the combining
procedure is cons.
(cons 1 #t)
=> (1 . #t)
Dotted pairs are not self-evaluating, and so to specify
them directly as data (ie, without producing them via
a cons-call), one must explicitly quote them:
'(1 . #t) => (1 . #t)
(1 . #t) -->ERROR!!!
The accessor procedures are car and cdr:
(define x (cons 1 #t))
(car x)
=> 1
(cdr x)
=> #t
The elements of a dotted pair can be replaced by the
mutator procedures set-car! and set-cdr!:
(set-car! x 2)
(set-cdr! x #f)
x => (2 . #f)
Dotted pairs can contain other dotted pairs.
(define y (cons (cons 1 2) 3))
y => ((1 . 2) . 3)
The car of the car of this pair is 1.
The cdr of the car of this pair is 2.
Ie,
(car (car y))
=> 1
(cdr (car y))
=> 2
Scheme provides procedure abbreviations for cascaded
compositions of the car and cdr procedures.
Thus, caar stands for "car of car of",
and cdar stands for "cdr of car of", etc.
(caar y)
=> 1
(cdar y)
=> 2
c...r-style abbreviations for up to four cascades are
guaranteed to exist. Thus, cadr, cdadr, and
cdaddr are all valid. cdadadr might be pushing it.
When nested dotting occurs along the second element,
Scheme uses a special notation to represent the
resulting expression:
(cons 1 (cons 2 (cons 3 (cons 4 5))))
=> (1 2 3 4 . 5)
Ie, (1 2 3 4 . 5) is an abbreviation for
(1 . (2 . (3 . (4 . 5)))). The last cdr of this
expression is 5.
Scheme provides a further abbreviation if the last cdr
is a special object called the empty list, which
is represented by the expression (). The empty
list is not considered self-evaluating, and so one
should quote it when supplying it as a value in a
program:
'() => ()
The abbreviation for a dotted pair of the form
(1 . (2 . (3 . (4 . ())))) is
(1 2 3 4)
This special kind of nested dotted pair is called a
list. This particular list is four elements
long. It could have been created by saying
(cons 1 (cons 2 (cons 3 (cons 4 '()))))
but Scheme provides a procedure called list that
makes list creation more convenient. list takes
any number of arguments and returns the list containing
them:
(list 1 2 3 4)
=> (1 2 3 4)
Indeed, if we know all the elements of a list, we can use
quote to specify the list:
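A minimal sketch, reusing the four-element list from above. The second line only confirms that the quoted list has the same contents as the one built with list.

```scheme
'(1 2 3 4)                           ; => (1 2 3 4)
(equal? '(1 2 3 4) (list 1 2 3 4))   ; => #t
```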
Scheme offers many procedures for converting among
the data types. We already know how to convert between
the character cases using char-downcase and
char-upcase. Characters can be converted into
integers using char->integer, and integers can be
converted into characters using integer->char.
(The integer corresponding to a character is usually
its ASCII code.)
(char->integer #\d) => 100
(integer->char 50)  => #\2
Strings can be converted into the corresponding list of
characters.
(string->list "hello") => (#\h #\e #\l #\l #\o)
Other conversion procedures in the same vein are
list->string, vector->list, and
list->vector.
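A brief sketch of these three conversions; the sample data are assumptions chosen for illustration.

```scheme
(list->string (list #\h #\i))   ; => "hi"
(vector->list (vector 1 2 3))   ; => (1 2 3)
(list->vector (list 1 2 3))     ; => #(1 2 3)
```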
Numbers can be converted to strings:
(number->string 16) => "16"
Strings can be converted to numbers. If the string
corresponds to no number, #f is returned.
(string->number "16")
=> 16
(string->number "Am I a hot number?")
=> #f
string->number takes an optional second argument,
the radix.
(string->number "16" 8) => 14
because 16 in base 8 is the number fourteen.
Symbols can be converted to strings, and vice versa:
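A minimal sketch using the standard procedures symbol->string and string->symbol. The names are kept in lower case so the results do not depend on whether the implementation folds case.

```scheme
(symbol->string 'symbol)    ; => "symbol"
(string->symbol "string")   ; => string
```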
Scheme contains some other data types. One is the
procedure. We have already seen many procedures, eg,
display, +, cons. In reality, these are
variables holding the procedure values, which are
themselves not visible as are numbers or characters:
cons => <procedure>
The procedures we have seen thus far are primitive
procedures, with standard global variables holding them.
Users can create additional procedure values.
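As a hedged preview of how that looks, the sketch below uses the standard special form lambda to create a procedure value and binds it to a variable. The name add2 and its behavior are hypothetical, chosen only for illustration.

```scheme
(define add2 (lambda (x) (+ x 2)))   ; add2 is a hypothetical name
(add2 5)             ; => 7
(procedure? add2)    ; => #t
(procedure? cons)    ; => #t  (primitive procedures are procedure values too)
```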
Yet another data type is the port. A port is the
conduit through which input and output are performed.
Ports are usually associated with files and consoles.
In our "Hello, World!" program, we used the
procedure display to write a string to the console.
display can take two arguments, one the value to be
displayed, and the other the output port it should be
displayed on.
In our program, display's second argument was
implicit. The default output port used is the standard
output port. We can get the current standard output
port via the procedure-call (current-output-port).
We could have been more explicit and written
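The explicit form can be sketched as follows, passing the port returned by (current-output-port) as the second argument. The newline call is an addition here and takes the same optional port argument.

```scheme
(display "Hello, World!" (current-output-port))
(newline (current-output-port))
```

Both calls behave exactly like their one-argument and zero-argument forms, since the current output port is the default.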
All the data types discussed here can be lumped
together into a single all-encompassing data type
called the s-expression (s for symbolic). Thus 42, #\c, (1 . 2), #(a b c), "Hello", (quote xyz),
(string->number "16"), and (begin
(display "Hello, World!") (newline)) are all s-expressions.