Sunday, April 24, 2016

Modelling the atom


Introduction

We shall discuss an example of representing some high-school science statements about electron layers. Of course this is not directly about metaprogramming, but it is a fairly well known subject, a fun subject (in my opinion) and the questions asked could have analogues in metaprogramming.
So let us start with a natural language description of the "knowledge" we are trying to represent, and comment as we go along on the representational issues that might arise. Please do not pick on the scientific accuracy of the statements below.

The statements

  1. There is a finite set called the chemical elements.
  2. Each chemical element has a unique name (which is a string).
  3. There is a certain set of physical objects called atoms.
  4. Every atom belongs to a unique chemical element.
  5. An atom can be in an ionized or non-ionized state.
  6. For each chemical element, any non-ionized atom that belongs to it has a finite set of electrons whose cardinality (count) is a function of the chemical element. This number is called the atomic number.
  7. Below we only deal with non-ionized atoms.
  8. The set of electrons in an atom is partitioned into subsets called layers.
  9. One of the layers is called the base layer.
  10. Each layer has a different energy level, a non-negative real number measured in electron volts.
  11. The base layer has an energy level of zero. 
  12. Sorting layers by energy level is the way we will adopt to index them. The first index is 1.
  13. If an electron is a member of a layer, its energy level is the energy level of the layer.
  14. Each layer has a number called a capacity. 
  15. The capacity is independent of the chemical element to which the atom belongs.
  16. No layer can contain more electrons than its capacity.
  17. The capacity is in fact equal to 2n² where n is the index of the layer. 
  18. A layer is said to be filled if it has as many electrons as its capacity. 
  19. If a layer of index n has an electron, then all layers of lower index are filled.

Where the statements might be used

We can imagine that something structured like the list of statements above is given in a high school comprehension test, and that the student then has to answer questions. For example:

  • An atom a belongs to a chemical element called Chlorine, whereas an atom b belongs to an element called Tin. Could they have the same number of electrons?
  • An atom has 60 electrons. Does the layer of index 2 have any electrons? If so, how many?
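
The second question can be worked through mechanically from statements 12, 16, 17 and 19. What follows is only a minimal sketch: the function names capacity and fill_layers are mine, not part of the statements, and the electron count is assumed to be a non-negative integer.

```python
def capacity(n: int) -> int:
    """Capacity of the layer of index n (statement 17): 2n^2."""
    return 2 * n * n


def fill_layers(electron_count: int) -> list[int]:
    """Distribute electrons over layers, lowest index first.

    Statement 19 forces every layer below an occupied one to be filled,
    and statement 16 caps each layer, so the result is fully determined.
    Position 0 of the returned list is the layer of index 1.
    """
    layers = []
    remaining = electron_count
    n = 1
    while remaining > 0:
        placed = min(remaining, capacity(n))
        layers.append(placed)
        remaining -= placed
        n += 1
    return layers


occupancy = fill_layers(60)
print(occupancy)     # [2, 8, 18, 32]
print(occupancy[1])  # electrons in the layer of index 2: 8
```

Under these statements, then, an atom with 60 electrons has a filled layer of index 2, holding 8 electrons.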


Replacing the terms by unfamiliar ones and the effect on legibility

Suppose we replace much of the terminology above with terms that are not part of the usual vocabulary: "chemical elements" become "fargizles", atoms become "meditrons", layers become "compartments", and so on. Then all of a sudden we might notice that some things are not at all clear in the above description. It's an experiment worth carrying out.
It's a bit like the idea of renaming the identifiers in the source code of a program. All of a sudden it becomes extremely hard to read. This analogy does not go very far though.

Dependencies, order and permutation

Another interesting aspect of the statements is that there are obvious dependencies between them; statement 4 depends on statement 1 to make sense, for example. This raises a reasonable general question:
If we permute the order of the above statements, can we necessarily make sense of the set of statements and reconstruct the dependencies? I'm not sure this is obvious, but when I was a student, our teacher on expert systems insisted that the "rules" in an expert system were not ordered. If permuting a large set of statements can make the reconstruction of the dependencies unworkable, is this a defect that should be got rid of, or is it just something we have to live with?
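
As a rough illustration of the easy half of this question: if the dependencies were already known, a readable order could be recovered from any permutation by a topological sort. The sketch below assumes a hand-written and purely hypothetical dependency map for statements 1 to 6; discovering that map from the text itself is the hard part, and the sketch does not attempt it. It uses graphlib from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies, written by hand for illustration only:
# each statement number maps to the statements it needs in order to make sense.
DEPENDS_ON = {
    1: set(),
    2: {1},     # names are given to chemical elements
    3: set(),
    4: {1, 3},  # atoms belong to chemical elements
    5: {3},     # ionization is a state of an atom
    6: {4, 5},  # electron count of a non-ionized atom of an element
}


def reading_order(depends_on: dict[int, set[int]]) -> list[int]:
    """Return one order in which every statement follows its prerequisites."""
    return list(TopologicalSorter(depends_on).static_order())


# The order in which the statements are supplied does not matter.
permuted = {k: DEPENDS_ON[k] for k in (6, 2, 5, 1, 4, 3)}
order = reading_order(permuted)
print(order)
```

Several orders can be valid, so the printed list is only one of them; a circular dependency would make static_order raise graphlib.CycleError.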

Non-monotonic reasoning

Here's another point. Anybody who reads the above makes inferences while reading the statements, and we may ask if some of the inferences are going to be invalidated and reconsidered. This is the problem of non-monotonic reasoning.

Attempts at formalisation and pitfalls

Formalisation evokes the choice of data structures or the use of well-established representational relations like "part-of" and "instance-of". These are representational commitments, and I should stress my view (paraphrasing Knuth) that premature formalisation is the root of all evil. At the very least we should not discard the original natural language statements.

The set of electrons

Concerning statement 6 above: every atom has a finite number of electrons. We have some formalisation concepts to bring to bear on this, namely that electrons are part-of atoms. If electrons are part-of atoms, and there are a finite number of them, one might be tempted to create an atom object with a list-of-electrons field, but this turns out to be completely wrong here, for several reasons, one of which is that electrons are not named, and we have no need for such a list in the reasoning.
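
To make the alternative concrete, here is a minimal sketch of an atom that carries no list of electrons at all: the count is looked up from the chemical element, as statement 6 suggests. The names Atom and ATOMIC_NUMBER are my own, and the two table entries are the usual textbook atomic numbers, which go beyond what the statements themselves provide.

```python
from dataclasses import dataclass

# Assumed lookup table: statement 6 only says that such a function exists.
# The two values are the usual textbook atomic numbers.
ATOMIC_NUMBER = {"Chlorine": 17, "Tin": 50}


@dataclass(frozen=True)
class Atom:
    element: str           # name of the chemical element (statements 2 and 4)
    ionized: bool = False  # statement 5

    @property
    def electron_count(self) -> int:
        """Number of electrons, defined only for non-ionized atoms (statement 6)."""
        if self.ionized:
            raise ValueError("the statements say nothing about ionized atoms")
        return ATOMIC_NUMBER[self.element]


a = Atom("Chlorine")
b = Atom("Tin")
print(a.electron_count, b.electron_count)  # 17 50
```

Nothing in the reasoning ever needs to tell one electron from another, so a single integer derived from the element carries everything a list-of-electrons field would.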

The collection of layers

On the other hand, having a list of the "layers" might seem ok, but the reader might wonder what the ordering of the layers would mean. It is not immediately obvious from statement 8 that the layers have any ordering at all, but when we come to 10, we can deduce that if each layer has a different energy level, then the energy level can be used to order the layers.
Finally, this is done at statement 12. At this point it is as if the person who has written the statements down has imposed "the" ordering of the layers.

It could be thought that some statements, like 8, are therefore redundant and should be eliminated. I retort that eliminating them would add cognitive strain for the person entering the ordered set of statements, that keeping them should not cost anything in the long run anyway, and that there might even be some less obvious advantages, because it might make the sequence of statements easier to revise.

The rules of filling have representational consequences

The fact that layers are filled with electrons from the lowest level upwards makes it simpler for us to think about the state of the collection of layers. In particular, only the number of electrons needs to be stated; from there we can instantly tell how many electrons there are in each layer. In traditional formalisation practice this is something that is left entirely up to the designer of the data structures. I want to suggest that leaving it entirely up to the human is probably not a good idea in the long run.
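
The claim that the electron count alone determines the state can be checked by brute force on small cases. The sketch below is only an illustration under my own naming (satisfies_rules, valid_occupancies): it enumerates every way of distributing a given number of electrons over a fixed number of layers and keeps those that respect statements 16, 17 and 19.

```python
from itertools import product


def capacity(n: int) -> int:
    """Capacity of the layer of index n (statement 17): 2n^2."""
    return 2 * n * n


def satisfies_rules(occupancy: tuple[int, ...]) -> bool:
    """Check statements 16 and 19; occupancy[i] is the layer of index i + 1."""
    for i, count in enumerate(occupancy):
        if count > capacity(i + 1):  # statement 16: never above capacity
            return False
        lower_filled = all(occupancy[j] == capacity(j + 1) for j in range(i))
        if count > 0 and not lower_filled:  # statement 19
            return False
    return True


def valid_occupancies(total: int, n_layers: int) -> list[tuple[int, ...]]:
    """All distributions of `total` electrons over `n_layers` layers that obey the rules."""
    return [
        occ
        for occ in product(range(total + 1), repeat=n_layers)
        if sum(occ) == total and satisfies_rules(occ)
    ]


print(valid_occupancies(5, 3))   # [(2, 3, 0)]
print(valid_occupancies(11, 3))  # [(2, 8, 1)]
```

In both cases exactly one distribution survives, which is why storing the count alone loses no information. A system that could notice this for itself would not need a human designer to choose the compact representation.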

A next step

The next step would be to build a system and try to teach it the above statements, with a formalisation that sticks as closely as possible to them. Then we could add further statements (for example concerning the emission of light when electrons change layers) and see whether the system is hard to extend, or whether it breaks down when we try.
I can think of two kinds of extensions:
  • add new scientific information (which might be unknown to a high school student)
  • try to replace the above statements with richer and more realistic ones, using "common knowledge": like talking about the shape and size of an atom.