Perceived Order: The Learning of Rules

This is a draft of Chapter 5, in the book outline.

The Learning of Rules

We have established that living things (LTs) can live much better in many circumstances if they cooperate with each other to exploit large or difficult resource patterns (RPs). We have seen that this cooperation is accomplished through rules which govern the behavior of LTs. Those rules, in the early example with critters, were given to the critters by us human modelers. But that is a big and arbitrary assumption, that we can give rules to critters.

What if LTs do not have us modelers, or some other superior and overlooking intelligence, to give them rules? The higher ambition for us modelers is to craft LTs which can discover their own rules. Ultimately we would like to show how LTs of our creation can survive by finding whatever RPs happen to exist in any universe they inhabit. We want them to be survivors, able to carry on without our oversight. If we modelers can achieve that, then we should have gained valuable insight into our own psychological existence.

This turns out to be a large and complex subject. In this chapter we will review some general assertions and we will look briefly in a few promising directions. But we will not develop complete answers to any of these questions; the area is still too big and unexplored.

1. Complications of learning rules

When we start to explore how the LTs in our models will learn the needed rules, many complications arise. Our quest becomes clouded, or sometimes divides into distinct programs, with each new complication that we encounter. In this first section of the chapter we will review eighteen of these complications.

1.1 Care and conscious attention must be exercised when giving new abilities to LTs

We human modelers need to be mindful of what powers we are giving, or assuming to exist, in the minds of our LTs. This is because we can gain insight when we acknowledge that critters from a given cast of development simply can not see or know things which are obvious to us humans. We can also gain insight when we discover that the gift of a simple mechanical rule can empower a population of critters to accomplish coordination which we would have thought requires significant intelligence. Because of this I appeal to the reader to notice carefully when assumptions are made about what a LT in a given thought experiment can sense, know, or think.

My preaching grows from my own experience. My own modeling within RPM has given me examples of failures to account for powers which I have assumed my LTs possess. So we all need to be on guard with these thought experiments. If we pay attention and leave a list showing each additional ability, we create the condition in which we may gain the most rewarding insights into our human mental and social experience.

Here we may recall an advantage of computerized agent-based modeling, as contrasted with the thought-experiment agent-based modeling which we commonly employ in this book. A problem arises because natural language words, in which we first consider and then later specify our experiments, commonly express fuzzy meanings. But when we strive to write computer programs to make computerized agents perform as we had imagined they would perform, our unconscious assumptions are exposed, as I discussed previously. If you have not done any computer programming yourself, this will probably be attested by any computer programmer you know.

1.2 Rules generally not obvious

The rules are not always obvious. Consider for example the rules which humans have discovered for making iron. I do not know those rules myself so I feel sure I would never have thought of those rules if I suddenly found myself in a stone age setting. We can not assume that the rules needed will simply show themselves to the perceptions of some LT. We must be prepared to search for what is not obvious.

But this idea, that we will search for what is not obvious, might suggest that we do not know what we are searching for. In fact this is typically the case when the goods enjoyed in a later era were discovered through serendipity during the earlier era. No one would have searched for the way to make iron if that one did not know the great uses of iron. And no one could have known the uses of iron if someone had not already made iron.

In some cases, surely, we human overseers can conceive of the pattern of behavior which the LTs in our model will need to discover. In fact this is how we start one of our first thought experiments: with water and sugar laid out so we humans can see it and we know that our critters cannot see it. Then the value of work with RPM reveals itself as we challenge ourselves to give our critters just enough abilities so they can, through discovery of appropriate rules, exploit the RP we can see. But as we seek to model more complex situations there may be some resource patterns, such as those which make it possible to make iron, which even our human-level conceptions can not perceive unless we call in a metallurgist.

Other rules, which humans have somehow discovered but which are not at all obvious to me, are: the rules for rotating crops, and the rules that give trading value to money.

Figure 1. The tabletop immediately after a large RP has been added to the initial condition, but before any rules have been learned.

1.3 Rules may appear in sets correlated with RPs

In our first example of rule-based cooperation, there were three rules which acted together to help critters exploit a single RP. In case you do not recall, those rules were:

If you sense water on the left, carry it to the right and set it down.
If you sense sugar on the right, carry it to the left and set it down.
If you get thirsty or hungry, help yourself to what you need from the resources that pass through your possession.

Our purpose here is to notice the bundling of those three rules together. Any one or only two of those three rules would not have been sufficient to assure the prosperity which the three together created. So perhaps we should expect new rules which we may discover to be similarly bundled in sets. Exploitation of a new RP may require adherence to most or all of the rules in a set.
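The bundling can be pictured as a small rule-set in a program. The following Python sketch is only an illustration: the names `Critter`, `RULE_SET`, and `act`, and the true-or-false fields standing in for what a critter senses, are hypothetical and are not part of the model as specified.

```python
from dataclasses import dataclass


@dataclass
class Critter:
    # Hypothetical stand-ins for what the critter senses and feels in one cycle.
    senses_water_left: bool = False
    senses_sugar_right: bool = False
    thirsty: bool = False
    hungry: bool = False


def rule_carry_water(c):
    """Rule 1: water sensed on the left is carried to the right."""
    return ["carry water right"] if c.senses_water_left else []


def rule_carry_sugar(c):
    """Rule 2: sugar sensed on the right is carried to the left."""
    return ["carry sugar left"] if c.senses_sugar_right else []


def rule_consume(c):
    """Rule 3: a thirsty or hungry critter helps itself from what passes through."""
    actions = []
    if c.thirsty:
        actions.append("drink")
    if c.hungry:
        actions.append("eat")
    return actions


# The three rules are kept together as one set tied to one RP.
RULE_SET = (rule_carry_water, rule_carry_sugar, rule_consume)


def act(critter, rules=RULE_SET):
    """Apply every rule in the set and collect the resulting actions."""
    actions = []
    for rule in rules:
        actions.extend(rule(critter))
    return actions
```

Calling `act` with only the first two rules leaves a thirsty critter passing water along without ever drinking, which is one way to see why a partial set would not assure prosperity.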

1.4 Rule-sets may overlap

Next we meet another complication: these rule-sets may overlap. A particular rule may be included in more than one rule-set, and so may be applicable to exploitation of more than one RP. This overlap may lead us into a tension between redundancy and efficiency as we try to design the minds (or the computer programs) of our critters.

Here I also mention the possibility that some rules, drawn from different rule-sets, may contradict each other. But we already have enough complications, so we will defer consideration of this issue.

1.5 Rules may vary with time

Another factor complicates the learning of rules: Rules may change with time. Rules to exploit a given RP can work, obviously, only so long as that RP lasts. In our tabletop thought experiments, a rule to carry water to the right may be beneficial to the community for 100 generations of critters. But when the large deposit of water runs out almost all of our critters will die unless we have given them abilities which have enabled them to discover a different RP, requiring different rules.

1.6 Plans: Rules in packages with time spans

In the simple model of tabletop critters, with which we started, decisions about how to behave were restricted to a single cycle. The whole rule-delimited decision-making process started again in the next cycle (or clock tick) of the model. But in our human lives we know that many of our actions derive from plans which exist throughout a span of time. A single meta-rule, such as to finish a task before sunset, commonly affects many moment-to-moment choices of how to act. So we see that RPM will need to be extended at some stage in advancing research to start exercising an ability to learn not only rules affecting immediate choices, but also rules which act as plans governing sequences of immediate-choice rules.

1.7 The rules take on meaning only within a structure

We may tend to forget that rules, to be effective, need to be expressed within a system that can interpret the rules. That is, if we focus single-mindedly on discovering the rules, we may lose sight of our assumption that there exists an entity which can read, interpret, and act upon, the rules we have discovered. Daniel Dennett writes of a similar oversight possible in genetics, in which the elaborate encoding of instructions within DNA cannot have any use without the other mechanisms in a cell which read and act upon those instructions. (“... the language of DNA and the ‘readers’ of the language have to evolve together; neither can work on its own. ... only the combination of information from the code and the code-reading environment suffices to create an organism.” Darwin’s Dangerous Idea: Evolution and the Meanings of Life (1995) by Daniel Dennett, a Touchstone book, p. 196–7)

1.8 Learning rules enables advance in level of life

In Chapter 4, Life in Levels, we reviewed the evidence that life on Earth has grown in levels. We also met the suggestion that we humans, as we go about organizing our affairs with other humans, may well be in the process of creating the next higher level of life. The advance from one level to the next higher level may consist mostly of learning and adopting new rules.

1.9 Rules relate to the type of organization

At this point we should recall that organizations of LTs come in many types. In Chapter 4, we were introduced to an eight-type taxonomy of organizations in which we may categorize an organization by asking three questions about it. The questions address three qualities, asking whether the organization does or does not possess each quality.

member aware. Are the members (constituent lower-level LTs) aware of the organization?
self aware. Does the organization have a “headquarters” which can make some decisions on behalf of the organization?
encoded. Has the way to create such an organization been written down, or encoded, such that new copies can be created by following the encoding?

We should also be aware that the adoption by an organization of a new rule may help that organization to obtain one of the three qualities which we have used to categorize organizations. The adoption of a new rule might change the organization in a way that would cause us to change our yes-or-no answer to one of the three classifying questions. The change, if we think of it as an advance, would probably be from no to yes.
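The arithmetic of the taxonomy (three yes-or-no answers giving eight types) and the effect of a new rule on one answer can be sketched in a few lines of Python. The names `OrgType` and `adopt_rule` are hypothetical, chosen only for this illustration.

```python
from itertools import product
from typing import NamedTuple


class OrgType(NamedTuple):
    """One answer (yes = True, no = False) for each classifying question."""
    member_aware: bool
    self_aware: bool
    encoded: bool


# Three yes-or-no questions give 2 * 2 * 2 = 8 possible types.
ALL_TYPES = [OrgType(*answers) for answers in product([False, True], repeat=3)]


def adopt_rule(org, quality):
    """Return the type after a new rule has given the organization one quality."""
    return org._replace(**{quality: True})


before = OrgType(member_aware=False, self_aware=False, encoded=False)
after = adopt_rule(before, "self_aware")  # one answer changes from no to yes
```

Here `after` differs from `before` in a single answer, so the organization has moved from one of the eight types to another.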

1.10 The learning of a new rule depends upon the kind of LT we have

A new rule might succeed only if the organization in which it is attempted possesses one or more of the three qualities we just discussed. A new rule suggesting, for example, that all members of an organization should be given a new bit of data (data which has become known in some part of the organization) may require that the organization is self-aware, that the organization has a list of its members and a way to communicate with them. Thus, learning of a new rule may require awareness of the qualities of the organization in which the rule may be adopted.

Reviewing these last two points, we have seen that the learning of a new rule by an organization may interact in two ways with the type of that organization:

The learning once accomplished may change the type of the organization.
The learning may be possible or meaningful only for particular types of organizations.

1.11 Learning of rules does not pause for us when a new organization is accomplished

Probably the process of growing level-to-level does not pause when a given level is reached. Our modeling stops the process, like a still photograph, so that we can study the process and give names to things we see. But we may need to remember that the process may flow continuously.

1.12 Any LT must already possess a sub-structure of rules

In RPM, any LT which comes to our attention is probably already following a set of rules. LTs require resources to survive. If they simply do nothing for an extended period they will die for lack of a necessary resource. So a surviving LT must at least have a rule to imbibe a resource when it can. A surviving LT probably also has a rule, or a complex set of rules, to seek opportunities to imbibe.

With the model of tabletop critters we start, as you will recall, in the initial condition with a set of critters already surviving, although poorly. As we have just described, these critters must be following rules, rules which we modelers gave the critters just so that we could start somewhere. We may need to recall this assumption at some stage of our future research.

1.13 A new rule may be optional

Generally the first overall rule for any LT will be to save its life if it can. Since we start most of our thought experiments with a population of LTs which we assume to be surviving, if not yet prospering, each member of that population is probably already gifted with basic rules which enable its survival in dire circumstances. So the new rules which we will give to LTs in our thought experiments will not usually override what the LT already knows instinctively to do in a dire circumstance, but will be applied in better times. A new rule, applied tentatively during better times, may be part of a valuable discovery, as we will consider more in sections 5 and 6. But a new rule will rarely override basic survival instincts.

1.14 A rule may be known by all or only a few

In our first example of rule-based cooperation among critters all the critters in the scene were given the same set of rules. But some rules could be successfully implemented with only a small number of the members of the organization given awareness of the rule. Indeed this follows from specialization in roles, and may suggest further ways to clarify and divide our quest.

1.15 A rule may be known by none of the LTs

We human modelers, having developed a believable story that the critters established a path of trade between water and sugar, may look down upon the critters and say that they have learned the rules. But this “learning” that we perceive is not at this stage perceived by any of the critters.

What is known to each critter in the line consists only of that critter’s memory of successful acquisitions of resources, that is, memories of trades or finds in the neighborhood where it lives. No critter has the overview which we modelers enjoy. So no critter has the capability to see that it lives in prosperous circumstances, prosperous that is relative to the distant cousins still living as hunter-gatherers nearby. No critter can imagine the RP which we modelers laid down to grow this model of critter-prosperity.

1.16 Rules relate to perspectives

The challenge which focuses this chapter, how will rules be learned, overlaps to some extent with the question addressed in a previous post: How can a LT develop a needed perspective? There is clearly a strong relationship. But for the present we will push forward with discovery of rules without getting entangled in the relationship suggested by development of perspective.

1.17 Rules may be rough at first, but improved later on

The issues which we will face may be divided between two subjects:

learning rules for the first time, that is learning rules of any quality which are at least minimally workable, and
improving rules to make them operate more efficiently.

To illustrate using the model of the tabletop critters, Figure 1 shows the initial condition immediately after addition of a RP. Figure 2 shows a crooked line of trade which may result after the first learning of rules. For a further explanation of this crooked line, see section 2.2.1.

Figure 2. A crooked line of trade which may result shortly after the first learning of new rules.

Figure 3 shows what we might expect after rules have been improved. In the remainder of this chapter we will focus mainly on the first learning of rules. Improvement of rules will receive more attention in the later chapter on Public Psychology.

1.18 Rules may or may not be inherited

We humans seem to inherit some rules as instincts or tastes. These rules, we suppose from the theory of evolution, served our ancestors and we inherited them. But other rules, such as the meaning of a word in a vernacular language, clearly must be learned anew by each generation. We mention this complication but, for now, worry no more about it.

Figure 3. An almost straight line of trade such as we expect after rules have been first learned and then later improved.

2. One way that rules may be discovered: Trade

In this section we will develop an example of one way that rules may be learned, through mutually beneficial trade.

We will use the model of tabletop critters, giving them both ability to trade and a large resource pattern; this is essentially the challenge which we described in Section 2.4.1. Our aim is to show how critters could get from the condition shown in Figure 1 to the better condition shown in Figure 2. To some readers it may seem obvious, as it does to this author, that critters with trading ability would, given time, form a line of trade between water and sugar. But other readers may want more analysis. So, in case it does not seem obvious to you, in sections 2.1 and 2.2, I will give a longer explanation.

2.1 Review of initial condition

We start out in a situation much like the model’s initial condition. I will give a review of the initial condition in this section, but a more complete description may be found in this post.

A sparse population of critters lives like hunter-gatherers on the tabletop. The critters need both water and sugar to live since their bodies consume some of each of these two essential resources in each increment of time. A critter will die if it runs out of either essential resource.

We give the critter a sense of location so that it always knows where it is on the tabletop, that is its X-Y coordinates. A critter also has a sense of touch in the area nearby around its body. Within this sense-area a critter can sense and identify another object which might be water, sugar, or another critter. But a critter has no sense at all of anything outside its sense area. It has no sense of sight.

In each increment of time (each cycle of the model) a critter decides how to act based upon:

its internal state, that is which resource it needs most badly;
external circumstance, meaning where it is on the tabletop and what if any other objects are in its sense area;
relevant memories.

Most of the time a critter will decide to move since only infrequently will it find itself within reach of a consumable resource. If a critter decides to move the direction it chooses will be affected by two factors:

It will move toward where it calculates it has a good chance of finding the resource it needs most. To do this, it will remember when and where on the tabletop it has previously found some of that resource. It will favor moving in a direction toward such a location, and it will favor recent memories over older memories. Perhaps, if it has many memories it will choose a direction toward an average location derived from those memories.
Its direction will be somewhat random. If it has helpful memories and therefore can calculate exactly the location, where it would like to arrive after some requisite number of steps, it will still add some random variation around the direction it chooses for each step as it moves toward that target location. If, on the other hand, it has no helpful memories to guide its choice of direction, it will choose a random direction.
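The two factors above can be combined in a short Python sketch. This is one possible reading, not a specification: the function name `choose_direction`, the exponential weighting of recent memories (`half_life`), and the Gaussian variation around the chosen heading (`noise`) are all assumptions made for the illustration.

```python
import math
import random


def choose_direction(position, memories, now, rng=random, noise=0.5, half_life=20.0):
    """Return a heading in radians for the critter's next step.

    position:  (x, y) of the critter on the tabletop.
    memories:  list of (x, y, time_found) for the resource needed most.
    noise:     assumed standard deviation, in radians, of the random variation.
    half_life: assumed number of cycles after which a memory counts half as much.
    """
    if not memories:
        # No helpful memories: choose a random direction.
        return rng.uniform(-math.pi, math.pi)

    # Favor recent memories over older ones.
    weights = [0.5 ** ((now - t) / half_life) for _, _, t in memories]
    total = sum(weights)

    # Aim at a weighted average of the remembered locations.
    target_x = sum(w * x for w, (x, _, _) in zip(weights, memories)) / total
    target_y = sum(w * y for w, (_, y, _) in zip(weights, memories)) / total
    heading = math.atan2(target_y - position[1], target_x - position[0])

    # Even with a known target, add some random variation to each step.
    return heading + rng.gauss(0.0, noise)
```

A modeler would call this once per cycle for each critter that has decided to move. Setting `noise` to zero removes the random variation, which is convenient when checking that the weighting behaves as intended.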

Life is possible in this initial condition on the tabletop because water is deposited by fate onto the tabletop in small quantities and at random locations. Similarly sugar is deposited by fate in other random locations. The deposits of a resource are small, in that the critter which first discovers one can probably consume the entire deposit in one cycle, leaving none either for another critter to find or for itself to enjoy on a later return trip to this location. But the deposits are usually large enough to supply the dietary requirement of a critter for that resource for many cycles. In this way a critter which has just discovered and devoured a deposit of one resource can now go off in search of the other essential resource. With luck it will find some of that other resource before it expires.

In the initial condition, given the circumstances we have sketched, it turns out that a critter cannot make much use of its ability to remember where it previously found a resource, because the deposits fall in random locations and are usually consumed entirely upon first discovery. A critter’s best strategy is probably to just keep moving about randomly and hope for good luck. But the critter’s ability to remember details about its previous finds will become crucial, next.

2.2 Extend the model with a Resource Pattern and new critter abilities

Now we add a large resource pattern as described previously. Recall that the water and sugar in this large RP are further apart than a critter can expect to travel in its entire lifetime, so no critter could flourish alone by journeying between water and sugar. The large resource pattern can be seen in Figure 1.

We also give new abilities to the critters. In addition to the abilities inherited from the initial condition each critter will be able to:

calculate its days-on-hand for each of the two essential resources. For each resource, this involves dividing the amount the critter has in its internal store by the amount which the critter’s body consumes during each cycle (day). For example, if a critter has eleven units of water in its internal storage and if its metabolism uses one unit of water during each cycle, then it will calculate that it has eleven days-on-hand of water.
compare the days-on-hand for each of the two essential resources and thus calculate terms under which it would trade. It would seek to trade a quantity of its more-plentiful resource for a quantity of its more-needed resource. Perhaps it will accept any terms which would extend its expected days of survival based upon both resources.
display willingness to trade by displaying a symbol, readable by any other nearby critter, signaling which resource is offered and by implication which is sought.
recognize another critter with an opposite (compatible) trading aim.
negotiate, agree if possible, and complete a beneficial trade.
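The first abilities in this list can be sketched in Python as follows. The sketch rests on assumptions which should be stated plainly: the consumption rates in `USE_PER_CYCLE` are invented, all function names are hypothetical, and "expected days of survival based upon both resources" is read here as the smaller of the two days-on-hand figures, since a critter dies when either resource runs out.

```python
# Assumed metabolic rates: units of each resource consumed per cycle (day).
USE_PER_CYCLE = {"water": 1.0, "sugar": 1.0}


def days_on_hand(store, resource):
    """Amount in the internal store divided by the amount consumed per cycle."""
    return store[resource] / USE_PER_CYCLE[resource]


def expected_survival(store):
    """Assumption: the scarcer resource governs how long the critter can last."""
    return min(days_on_hand(store, r) for r in USE_PER_CYCLE)


def offered_resource(store):
    """The more-plentiful resource, which the critter would display as on offer."""
    return max(USE_PER_CYCLE, key=lambda r: days_on_hand(store, r))


def accepts(store, give, receive):
    """Accept any terms that extend expected survival.

    give and receive are (resource, quantity) pairs.
    """
    after = dict(store)
    after[give[0]] -= give[1]
    after[receive[0]] += receive[1]
    if after[give[0]] < 0:
        return False  # cannot give more than it holds
    return expected_survival(after) > expected_survival(store)


def compatible(store_a, store_b):
    """Two critters have opposite trading aims if they offer different resources."""
    return offered_resource(store_a) != offered_resource(store_b)
```

With the example from the text, a store of `{"water": 11, "sugar": 2}` gives eleven days-on-hand of water. Such a critter would offer water, and under the assumed reading it would accept giving three units of water for three of sugar, since its expected survival rises from two days to five.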

2.2.1 The critters can and eventually should establish a trading path
Now, with the large resource pattern in place and with critters given ability to trade, we will continue our thought experiment by predicting what we expect to happen in the neighborhood of the large deposit of water on the left.

Consider Figure 4. Around the large water drop we have sketched three regions. We will assume that a given critter might travel about as far as the distance between two regions. Critters can and will wander across these region boundaries which we imagine, but we will assume that no critter which starts in Region A could wander as far as Region C.

Figure 4. Regions around large deposit of water. The critters become water-rich, first in Region A, then B, then C.

We expect that a critter which starts out in Region A (nearest the large deposit of water) may soon stumble into that deposit. It will remember the location and will return there in the future when water is once again its primary want. But this critter must still find sugar in order to survive. So, after the passage of some time, we expect that most critters in Region A will spend most of their time looking for sugar by foraging outward from the water drop.

The principal source of sugar in Region A continues, for now, to be what it was in the initial condition – the random small sprinklings of sugar dropped by fate onto the tabletop. But there is now an additional way that a critter might get sugar: through trade with another critter which it might meet. If that other critter has happened recently to discover an abundance of sugar but not water, it will probably agree to swap with our water-rich critter. That other critter will remember where it met our water-rich critter and will be inclined to return to this spot if in the future it once again finds enough sugar to be in a position to want to trade sugar for water.

Now consider the critters foraging in Region B. At the outset we expect that about half of them will be in greater need for water than for sugar; that is just a continuation of the way it always was in the initial condition. But after the addition of the large RP, we expect some Region B critters who have more days-on-hand of sugar than water will have met, when they wandered in the direction of Region A, a water-rich critter willing to trade. These two will complete a trade and remember the location. In this way after the passage of still more time we expect that Region B, like Region A before it, will come to be populated by critters who are relatively rich with water and who are seeking sugar.

With a similar story repeated for Regions B and C, we expect the passage of yet more time will generate in Region C a population more likely to be seeking sugar than water.

None of these critters which need sugar more than water knows, as we modelers know, about the huge deposit of sugar far away to the right (in our picture), so the sugar-seeking foraging of critters outward from the water will occur in all directions around the water.

Now we assume that a similar thing will have been happening around the sugar, but with the resources reversed: In all directions outward from the large deposit of sugar will live a population of critters more likely to be seeking water than sugar.

Eventually, in the area between the water and sugar, a critter which happens to be bestowed with water which came originally from the large water deposit will meet a critter which happens to be bestowed with sugar from the large sugar deposit. A trade will occur. We will call this the first trade which is enabled by the large RP. The two participants in the first trade will remember the location and each will be disposed to return to that spot when once again it needs the resource it received there.

Of course those two original partners to the first trade may chance to meet again at that spot. But note that our critters are not loyal to particular trading partners; any other critter which randomly goes within reach of that first-trade spot when either one of the first-trade partners has returned may also benefit from a trade at that spot. For all critters in that region, a trip to that spot has a better chance of giving a successful trade than a trip to some other randomly chosen spot in the region. The first-trade spot seems likely to become a location remembered favorably by several critters in addition to the first two trading partners.

As we have described the circumstances leading up to the first trade, we might expect the location to be on the shortest line between the water and sugar, because a point on that shortest line would be more probable than other points in that area. But it is unlikely that the first meeting would occur exactly on that shortest line. We expect rather only that the trade will be somewhere in a more broadly defined area, as shown in Figure 5.

Figure 5.

We expect a regular trade route, engaging many more critters than the two partners in the first trade, may be established through the first-trade location, as shown in Figure 2.

2.3 Reflections upon this model

2.3.1 What we see, what critters cannot see
At the end of the above thought experiment, pictured in Figure 2, we human modelers can say that the critters have learned rules which enable their cooperation in trade. We can say this because of the evidence of prosperity that we can see:

We see critters clustered in a line of trade. This is a configuration of critters on the tabletop which could not have occurred in the earlier initial condition, we know, because such a density must have something feeding it and the initial condition provided no such concentrated nourishment.
At the ends of that line of trade we can see the supplies of water and sugar which we know support the line.
Our modelers’ record keeping lets us see that critters in the line live longer lives than their unorganized kin who still live as hunter-gatherers in remote regions.
Our oversight also informs us that critters in the line almost never die of starvation for an essential resource. But the fate of starvation is the most common death experienced by unorganized critters in remote regions, as we can see in our model’s data record.

We humans are able to see these things because we made this model, after all, so that we can cultivate and observe behavior in populations composed of individuals deliberately simpler than we are. We humans have advanced senses, language, and powers of cognition.

But, while we humans can see prosperity gained among our critters, no critter can see any of the evidence of prosperity listed above. To understand why, recall that at this stage of the model development we have given the critters no more than a list of specific abilities, as outlined in sections 2.1 and 2.2 above.

If this is not already clear, consider any critter in the established line of trade. This critter cannot see or know:

that it is in a line of critters, a line that extends between large deposits of water and sugar.
that it has a better life than unorganized critters, better that is than its ancestors had before this line of trade was discovered and better than its kin-critters still living in the outlying region.
where or how, when it obtains a resource in trade from another critter, the other critter got the resource. Such speculation is far beyond the calculating capacities we have given critters at this stage of development.

We conclude that perception of prosperity among LTs is not necessarily possible for all LTs. In this example the perception is possible for us humans but not for our critters.

2.3.2 The critters’ “learning of the rules” is distributed among the many critters in the line
What does it mean that LTs so simple-minded as our critters have “learned the rules”? To find the answer we use our modeler’s-eye view to look into the memories of critters. We find useful memories. A fortunate critter in the line has many memories of acquisitions which were recent and nearby, whereas the less fortunate unorganized critter has fewer memories, and those more remote in both time and distance.

So the information which gives rise to our perception that the critters have “learned rules” exists only in memories distributed among critters. We human modelers tend to think of the critters’ learning of rules as one accomplishment. Indeed – in our view – it is one accomplishment. But in the critters the body of information is divided among all the critters in the line, each critter possessing only a list of favorable memories of trades.

2.3.3 The organizational classification of this line of trade
We can now classify the line of trade based upon the discussion just completed. In section 1.9 above, as you may recall, we reviewed a way to classify an organization into one of eight types. The classification hinges upon three yes-or-no questions. Asking those questions about the line of trade, we see it is:

not member-aware as argued above. No critter knows about the line of trade.
not self-aware. Obviously the line of trade has no central office.
not encoded. There is no encoding, telling how to create a line of trade, which was created by or is accessible to the critters.

So the three answers are all “no”. But in the sections and chapters ahead we extend the abilities of the critters to discover and exploit resource patterns. There we will observe that these extensions lead us to answer “yes” sometimes, and so to assign the organizations so formed to different categories.

2.3.4 Rule-discovery through trade stands upon a sub-structure of rules
We should pause to notice an assumption we have made to reach this point. We gave critters the ability to negotiate and complete a mutually beneficial trade. But this was a simplification. A trade would not happen all at once but rather would take place over a number of steps, as a sequence of choices and acts, each of which is made in response to what the trading partner did immediately before.

This complication would surely be discovered by a human modeler who tried to create the effects we discuss here in the form of a computer program. The computations a critter needs in order to complete a self-serving trade may well require steps and decisions of which the modeler was unaware, until he realized that his program could not achieve the desired result without a specification of this underlying and necessary behavior.
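A modeler might meet this complication in a form something like the following Python sketch, in which a trade is a sequence of steps and each step happens only if the one before it did. The four step names and the callback `step_performed` are hypothetical; I do not claim that a real trade between critters divides in exactly this way.

```python
from enum import Enum, auto


class Step(Enum):
    OFFER = auto()
    ACCEPT = auto()
    FIRST_GIVES = auto()
    SECOND_GIVES = auto()
    ABORTED = auto()


# The assumed order of steps in one complete trade.
ORDER = [Step.OFFER, Step.ACCEPT, Step.FIRST_GIVES, Step.SECOND_GIVES]


def run_trade(step_performed):
    """Walk through the steps in order and stop at the first one not performed.

    step_performed(step) -> bool is a hypothetical callback reporting whether
    the responsible critter carried out that step. Returns the steps that were
    completed, followed by ABORTED if the trade broke off.
    """
    history = []
    for step in ORDER:
        if not step_performed(step):
            history.append(Step.ABORTED)
            break
        history.append(step)
    return history


complete = run_trade(lambda step: True)
broken = run_trade(lambda step: step is not Step.SECOND_GIVES)
```

Even this toy version forces decisions which our earlier story passed over: who offers, who gives first, and what happens when a step fails.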

Even though the title of this chapter promises to explore how rules will be learned, here once again when we examine our assumptions we see that we are giving rules to the critters, rules which act at a lower level in the organizational structure. At this stage we will not pretend to know how critters might have learned these lower-level rules, if we had not made a gift of those rules. Rather we will grope ahead, working with concepts which I admit are vague for the time being.

3. Ethical Interactions

3.1 Ability to cheat trading partner

Now we will call into question one of the many rules which we have assumed necessary to complete a mutually beneficial trade. This is the rule to give what has been promised to the trading partner.

In the division of trade into a sequence of steps through time, it will usually occur that one partner gives a resource first and is thus for the moment exposed to risk until the other gives also in exchange what has been previously agreed. Thus the partner who gives second has an opportunity to cheat, to take what has been offered while not reciprocating.

Notice that if a critter did cheat in trade then the cheated partner would not gain a favorable memory of successful trade at that spot. The cheated partner would have no reason to return to that location in the future when it needed the resource which it had failed to obtain. So, without mutually beneficial trade the line of exchange would never form. If a line had formed and consistent cheating started within that line then the line would dissolve before long.

The only lines which continue, it would seem, are those with uniform or almost uniform fidelity. We may guess at this stage that some small degree of cheating would not destroy the line of exchange, but this raises complex questions beyond our present scope. Here we will conclude only that, in order for a successful community of trade to form, the critters, when given the ability to cheat in trade, must learn a rule not to cheat in most opportunities.
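The effect of cheating upon memory can be sketched in Python as follows. I assume here that the first partner gives water and the second gives sugar, one unit each, and that a favorable memory is simply a remembered spot. The names `Trader`, `exchange`, and `good_spots` are hypothetical, and whether the cheater itself remembers the spot is left out; that is my simplification.

```python
from dataclasses import dataclass, field


@dataclass
class Trader:
    water: int
    sugar: int
    honest: bool = True
    good_spots: list = field(default_factory=list)  # spots of completed trades


def exchange(first: Trader, second: Trader, spot: float) -> bool:
    """First gives one water, then second gives one sugar unless it cheats.

    Returns True only if the trade completed. A favorable memory of the spot
    is recorded only for a completed trade.
    """
    first.water -= 1
    second.water += 1              # the first giver is now exposed to risk
    if not second.honest:
        return False               # cheated: no favorable memory is recorded
    second.sugar -= 1
    first.sugar += 1
    first.good_spots.append(spot)
    second.good_spots.append(spot)
    return True


giver = Trader(water=5, sugar=0)
fair_partner = Trader(water=0, sugar=5)
fair_result = exchange(giver, fair_partner, spot=3.0)

victim = Trader(water=5, sugar=0)
cheater = Trader(water=0, sugar=5, honest=False)
cheat_result = exchange(victim, cheater, spot=3.0)
```

After the fair trade the giver remembers the spot and has reason to return. After the cheat the victim has lost water and gained nothing, not even a memory, so nothing draws it back to that location.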

3.2 Ability to overpower and eat others

Next we will momentarily consider an extension of the critters’ abilities. This extension, which will bring the model one step closer to our human experience, gives the critters power to prey upon one another: When circumstances allow, one critter (or organization of critters) may kill and eat another critter and thereby gain water and sugar from the body of the victim. The body of a critter is, we see now, a resource pattern which may be discovered and exploited by other critters.

With critters in our model thus enhanced, suppose we start again with a situation such as pictured in Figure 1. It seems unlikely that critters will form a line of trade as shown in Figure 2, because any predictable density of critters would represent a resource pattern inviting other critters to return to that locale looking for critters to eat. Also, when a critter is eaten, its memories of favorable trades, the memories which gave rise to the line of trade between water and sugar, disappear from the tabletop along with the unfortunate critter.

So here again, as when we gave critters ability to cheat, we will give our critters ability to learn an ethic. This ethic, when learned, will deter critters from killing and eating each other in most normal circumstances. Once again then it will be possible for critters to discover a prosperous line of exchange, assuming we give them enough time during which they will test both using and not using this new ethical rule.

4. Population and Resource Accounting

Our discussion above shows that the presence of a large new RP is not sufficient for prosperity. Prosperity can follow only if the critters learn important rules. But still that large RP is necessary for the prosperity as can be shown by a type of accounting.

We can see this accounting by reviewing the example of the initial condition on the tabletop. As we told that story, fate dropped a morsel of water here, a morsel of sugar there, in spots selected at random. The rate of this dropping of resources limited the size of the population of critters. The body of each critter consumes a certain amount of water and sugar in each cycle. So, on average, the amount consumed by the entire population of critters per cycle must be less than or equal to the amount dropped by fate per cycle. The size of the population is limited, as this accounting shows, by physical facts.
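This accounting can be written as a small calculation. The Python sketch below assumes that every critter consumes the same whole-number amounts each cycle and that fate drops fixed amounts each cycle; the function name `max_population` and the numbers in the example are mine, chosen only for illustration.

```python
def max_population(water_dropped: int, sugar_dropped: int,
                   water_per_critter: int, sugar_per_critter: int) -> int:
    """Largest population whose consumption per cycle does not exceed supply.

    All four amounts are per cycle. The scarcer resource sets the limit.
    """
    by_water = water_dropped // water_per_critter
    by_sugar = sugar_dropped // sugar_per_critter
    return min(by_water, by_sugar)


# Fate drops 100 water and 60 sugar per cycle; each critter needs 2 and 3.
limit = max_population(100, 60, 2, 3)   # min(50, 20) = 20
```

In this example water alone would support fifty critters, but sugar supports only twenty, so twenty is the limit.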

Next, as our story of critters has developed, we added a large new RP with the water and sugar. For our present purpose, which is to learn how new rules to exploit a new RP may be learned, we can safely assume the large new RP is unlimited. We assume the large new RP will outlast our present experiments in rule-learning. Eventually of course the large RP may be consumed. That prospect, and the challenge it represents, lies outside the scope of this chapter, but it received some consideration in an earlier post.

5. Other Rule-Learning Paradigms

We have seen in the preceding discussion that rules may be discovered spontaneously by a population of mutually-respectful traders. Now we will turn to consideration of other ways that rules may be discovered.

The other ways which we will consider here all require at least a little wealth on the part of our LTs. As we mentioned in section 1.13, a new rule may be optional. When a LT is near starvation for want of an essential resource — when it is most certainly not wealthy — then its behavior will probably follow primitive rules applicable in an emergency. In order to survive it must bet its life upon the most immediately promising pattern of behavior. But as we have seen previously there may be opportunities on the tabletop when large gains can be had if new rules can be discovered. As such, when LTs have at least a modest store of all essential resources, it will sometimes be the best strategy for the LTs to act in ways which increase their mid-to-long-term prospects of discovering and exploiting new resource patterns, that is of discovering new rules.

5.1 Rules may seem evident, or self-evident, to all

The first category we will consider will be a circumstance in which all the critters can sense the resource pattern. It will still be the case that a single critter cannot exploit the RP alone, because the water and sugar are too far apart, so prosperity can ensue only if these critters discover rules of cooperation.

For a specific example, assume that we start with trade-enabled critters such as we used in section 2.2. We then give each of the critters a sense of sight such that any critter can see the mountain (relative to the size of a critter) of water in one direction and the mountain of sugar in another direction.

Finally we give the critters these additional rules.

In fortunate circumstances, in which you are not in immediate danger of dying for want of a resource, pause to look around.
If you see large deposits of both water and sugar, note their directions.
Add a bias to the direction you will choose for your future wanderings, such that you will move eventually into some position between the water and sugar.

Once a few critters have so moved themselves between water and sugar it should not be long before mutually beneficial trades lead to a dense population between water and sugar, with members of this dense population all benefiting from both the water and sugar in the large RP, as we see in Figure 3.
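The three additional rules might be coded along the following lines. This Python sketch assumes a one-dimensional tabletop on which a critter steps left (-1) or right (+1), and it treats “seeing” a deposit as knowing its coordinate. The names `next_step`, `danger_level`, and `bias`, and their default values, are assumptions of mine.

```python
import random


def next_step(x, water_x, sugar_x, reserves,
              danger_level=2, bias=0.3, rng=random):
    """Return -1 or +1: a wandering step, biased only when the rules allow.

    x         current position of the critter
    water_x   position of the water deposit, or None if not seen
    sugar_x   position of the sugar deposit, or None if not seen
    reserves  (water, sugar) currently held by the critter
    """
    p_right = 0.5                                  # unbiased wandering
    secure = min(reserves) > danger_level          # rule 1: not in danger
    sees_both = water_x is not None and sugar_x is not None   # rule 2
    if secure and sees_both:
        target = (water_x + sugar_x) / 2           # rule 3: aim between them
        if x < target:
            p_right += bias
        elif x > target:
            p_right -= bias
    return 1 if rng.random() < p_right else -1
```

The bias only tilts the random wandering; it does not march the critter to the midpoint. A critter short of either resource ignores the deposits entirely and wanders as before, which reflects the condition in the first rule.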

Let us pause to consider the question of whether the success, which we see at the end of this thought experiment, can be called the result of a conscious plan. We human modelers are conscious of a plan in at least one human head. I made up those additional rules, after all, to give the critters what it would take to bring about the critter-success I seek.

But none of the critters is conscious of the plan because consciousness is not among the capabilities which we have given them. The critters’ “minds” are simple computer programs. The critters follow the given rules without awareness, as a walking human takes each step without being aware of the rules which must be guiding his nervous system as it sends instructions to the many muscles involved. So we conclude that yes, this success resulted from a conscious plan in a human mind, but not in any of the critters’ calculating capacities.

5.2 Rules may be proposed

Next we will consider the possibility that rules may be proposed by one or more critters. Some few among the critters may have special gifts which enable them to propose rules to the larger critter community. For example the gift of sight which was enjoyed by all critters in the previous example might be given to only one of the critters. This one critter now can see the RP, the mountains of water and sugar.

While this seems like a step toward solution of the problem, it should be obvious that the perception in the sight of the one gifted critter must be translated into messages which will be communicated to and understood by many other critters on the tabletop. The critters need a language sufficient for this communication. We must give substantially more powers to the critters if they are to act this way to exploit the RP.

This need for language, while a most interesting and important subject, would require a large leap in the calculating capacity of the critters. It raises many questions outside the scope of this chapter. But it should be worthwhile to highlight a few points here:

Our present scenario, with one critter gifted in a way that may be valuable to others, introduces specialization in our model. Specialization on a grander scale is of course what we humans do as we divide ourselves into professions and working roles.
In order for specialization to work, trust may be needed. Critters without the gift of sight, if they are to follow the suggestions of the gifted one, must trust that the gifted one is telling the truth. But trust must be balanced with healthy skepticism because once again we see the possibility that one set of critters may exploit another set. A too-trusting critter, expecting to move into a position of prosperity, may instead move to where it is killed and eaten.
In an advanced application of RPM, LTs may pay each other for goods and services. Critters who want good advice about where to go might pay sighted critters for the information. Or sighted critters, if they should happen to be wealthy as well, may act as entrepreneurs, paying blind critters to move into positions which will prove profitable, and then reaping the profit.
In other advanced applications of RPM, critters that specialize in carrying messages may help a larger community. These message carriers and the problems they encounter in RPM may help us humans gain fresh insights into the character of our own media.

5.3 Avocational choices

Humans who are wealthy have some free time. Wealthy people undertake many and various avocations, we know, and some of these avocations might result in discovery of new rules. Notice that many of us acquire knowledge not for any immediate gain but, as we might explain, for the mere pleasure of it. Then recall that plots of fiction sometimes show a human population surviving in the end because a single member had cultivated a unique expertise, and used that expertise to save the lot. Surely this sometimes happens in reality.

We suppose therefore that a population which contains many members with special knowledge, albeit farfetched, has a better chance of surviving unpredictable future crises. Thus we should not be surprised if our evolved human dispositions lead individual humans into various and odd-seeming avocations. We can see the variety of avocations as experimentation undertaken by the whole population. And, if our modeling with RPM ever gets so advanced that we have the option of imbuing individual LTs with distinct tastes for avocational pursuits, we might in this way unleash great powers for new rule discovery.

5.4 Variability within families

The propensity which we have just considered, for LTs to spend some of their wealth experimenting with avocations, may have a parallel expressed as variability within family groupings.

I will introduce this idea with an analogy to how humans invest their financial savings. We recall that:

Among investment opportunities, there is typically a tradeoff between risk and return. Conservative investments which are quite certain to preserve their value also promise little gain. Risky investments, on the other hand, which may lose most of their value, also give hope of great gain.
Among investors there is variability in the amount of risk they can tolerate in their portfolios. An investor who is only a slim margin above poverty, who has only one week’s pay to invest, should probably invest it conservatively, with low risk but high security for the principal. But an investor who has become much more wealthy, an investor who has already saved about ten years’ pay, may be wise to invest some in projects with both higher risk and higher expected return.

Now we consider a parallel in human families. Each child, I suggest, may grow up imbued with a characteristic motivation, a motivation to be either conservative or risk-seeking, or somewhere in between. Conservatism in this context will consist of adopting ways known to promise survival, that is, of adopting the basic values expressed by the lives of the parents. A risk-seeking child, on the other hand, may feel less reverence for the values expressed by the parents’ lives, and may be motivated to break far away from the pattern of life demonstrated by the parents. The risk-seeking child will more likely fail but, if he finds success, may find high returns.

Consider a single couple and the children which that couple may bear. The first child may be the only child and, as with the first savings, it makes sense for this child to be a conservative investment within this scheme I am suggesting. This first child will probably be imbued with the sort of conservative motivations just mentioned. But after our couple has a few children, or more, growing healthily, then the continuation of the family line seems secure. So then it makes sense for that family unit to invest in more risk-seeking children. That is, younger children in large families are more likely to aim their lives away from family tradition. Or at least this is what I believe I see in my own informal survey of the human families I have known: The oldest tend to be motivated conservatively, in the sense described, and the youngest tend to be risk seeking.

This applies to RPM in that we may vary the motives implanted in offspring. If and when we are modeling societies that can imbue their offspring with varying motivations, on this conservative to risk-seeking axis, we may do well during prosperous times to test more risk-seeking in offspring.
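If we ever model this, one very simple possibility is to let the risk-seeking motive of a newborn depend upon how many healthy older siblings it has. The Python sketch below is purely illustrative: the function `risk_appetite`, the linear form, and the values of `step` and `cap` are all assumptions of mine, not results.

```python
def risk_appetite(healthy_older_siblings: int, prosperous: bool = True,
                  step: float = 0.2, cap: float = 0.9) -> float:
    """Motivation of a newborn on an axis from 0.0 (conservative) upward.

    The first child is fully conservative. Each healthy older sibling raises
    the appetite for risk by `step`, up to `cap`. In hard times every child
    is conservative.
    """
    if not prosperous:
        return 0.0
    return min(cap, step * healthy_older_siblings)


first_child = risk_appetite(0)
third_child = risk_appetite(2)
tenth_child = risk_appetite(9)
```

A modeler could then test whether populations with this sort of graded motivation discover new rules sooner than populations in which all offspring are equally conservative.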

6. Charity may lead to rule discovery

Now we will consider charitable giving among LTs. We may expect giving to increase as the wealth of the givers increases. Charitable activity may contribute to the discovery of new rules through a process which shares features with the rule discovery we saw in section 2.

Consider this example. Suppose that a critter which lives near the mountain of water, and so has great wealth in ability to acquire water, has also been unusually fortunate recently in finding sugar. This critter, feeling very secure, enjoys some freedom of choice about how to behave during the immediate future. The critter may choose to spend some of its time carrying water (the resource which it can acquire in super-abundance) outward, away from the mountain of water, into the surrounding desert. This makes sense since sometimes, out there in the surrounding desert, one meets another critter near death for want of water. A charity-motivated critter feels reward upon giving water to another critter so needy.

This motive, or rule as we may call it, to give away a portion of the resource in greatest abundance, during times of prosperity, may be coded into the critter’s computer program by us human modelers who are trying to create a model of human charitable giving. Or if we are working in thought-experiment mode we might assert that this motive had been favored by natural selection and thus inherited by the critter in its genes.

In either case, suppose our critter aspires to help as many near-death-for-want-of-water critters as possible. Suppose that it has to decide which direction to travel as it heads away from the mountain of water in search of water-needing critters. I would expect it to find a higher density of such water-needing critters in the direction of the large deposit of sugar, because critters in that direction have a lower probability of dying for want of sugar. We suppose our charity-motivated critter remembers, as did the trade-motivated critter of section 2, where in the past it has met water-needing critters. So the charity-motivated critter will probably head away from the mountain of water in the direction of the mountain of sugar. But of course it cannot see the mountain of sugar or reason, as can we humans, about the cause of the greater density of water-wanting critters in that direction.

After passage of more time on the tabletop, our water-charity-motivated critter, when heading toward the mountain of sugar, may meet another critter motivated not so much by want of water as by ambition to give away some sugar. That is, charity-motivated outreachers may meet somewhere in the middle between the two essential resources. Although our water-charity-motivated critter was not seeking sugar at this time, it will remember the spot as a favorable location to visit in the future when sugar is again in short supply. These critters, first motivated only by charity, may thereafter fall into mutually rewarding trade. This story of charity-motivated discovery seems to have much in common with the trade-motivated story we developed in section 2.
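The charitable critter’s choice of direction needs nothing more than a tally of memories. In the Python sketch below I assume that the critter remembers, for each past meeting with a water-needing critter, only the direction in which it had traveled; the compass labels and the names `choose_direction` and `default` are hypothetical.

```python
from collections import Counter


def choose_direction(encounters: list[str], default: str = "north") -> str:
    """Head where the most water-needing critters were met in the past.

    `encounters` holds one direction label per remembered meeting. With no
    memories at all, the critter falls back on an arbitrary default.
    """
    if not encounters:
        return default
    return Counter(encounters).most_common(1)[0][0]


# Suppose the mountain of sugar lies to the east.
remembered = ["east", "west", "east", "south", "east"]
heading = choose_direction(remembered)
```

The critter heads toward the sugar without any concept of sugar entering into its choice. The direction emerges from where needy critters happened to survive long enough to be met.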

7. Conclusion

We will conclude this chapter on learning of rules with a reflection upon the ways that LTs can acquire rules. Consider three levels of rule acquisition.

In the first level we modelers use our overview to see what rules will enable critters to find success, and we code those rules into the critters’ programs. For example, in the initial condition we coded our critters to keep moving so they would have a better chance of stumbling onto the randomly dropped resources. For another example, we first demonstrated the power of rule-enabled cooperation by giving the critters the rules they needed to exploit a large RP: carry water right, sugar left, etc.
In the second level, which has been the emphasis of this chapter, we modelers again use our overview to see what rules the critters need. But in this case rather than code the rules directly into the critters’ programs or instincts, we give critters memory and calculating capacities which we, again using our overview, believe will enable the critters to discover the rules for themselves. This exercise places us humans in a laboratory in which we can learn important lessons about ourselves and other LTs.
In the third level we see our ultimate challenge. In this level we modelers do not perceive the next RP that our critters may discover and exploit. We strive to empower the critters to discover their own RPs, because either we modelers lack perceptual powers to see the RP or we cannot be with the critters, wherever they may be, to offer our guidance. In this we see that RPM seems to frame the epistemological problems of our human existence.

Wednesday, March 9, 2016

The Learning of Rules

1. Complications of learning rules

1.1 Care and conscious attention must be exercised when giving new abilities to LTs

1.2 Rules generally not obvious

1.3 Rules may appear in sets correlated with RPs

1.4 Rule-sets may overlap

1.5 Rules may vary with time

1.6 Plans: Rules in packages with time spans

1.7 The rules take on meaning only within a structure

1.8 Learning rules enables advance in level of life

1.9 Rules relate to the type of organization

1.10 The learning of a new rule depends upon the kind of LT we have

1.11 Learning of rules does not pause for us when a new organization is accomplished

1.12 Any LT must already possess a sub-structure of rules

1.13 A new rule may be optional

1.14 A rule may be known by all or only a few

1.15 A rule may be known by none of the LTs

1.16 Rules relate to perspectives

1.17 Rules may be rough at first, but improved later on

1.18 Rules may or may not be inherited

2. One way that rules may be discovered: Trade

2.1 Review of initial condition

2.2 Extend the model with a Resource Pattern and new critter abilities

2.3 Reflections upon this model

3. Ethical Interactions

3.1 Ability to cheat trading partner

3.2 Ability to overpower and eat others

4. Population and Resource Accounting

5. Other Rule-Learning Paradigms

5.1 Rules may seem evident, or self-evident, to all

5.2 Rules may be proposed

5.3 Avocational choices

5.4 Variability within families

6. Charity may lead to rule discovery

7. Conclusion
