Capabilities and Class Extensions for
Secure but Extensible Shared Smalltalk

Tony Hannan
Georgia Tech CS 7001
MiniProject 3
Sponsor: Mustaque Ahamad
December 2003

The Smalltalk programmer/user has complete autonomy to change any object/class in his system.  This becomes a problem when trying to share objects between different Smalltalk systems, because each system may have different versions of common classes and expect different interfaces from them.  The only reliable way to share objects is to prevent separate versions of common classes, but this is too restrictive since it wouldn't allow benign extensions.

The solution presented in this paper is to make common classes unchangeable, but allow benign class extensions to them as separate entities.  Security is enforced by using read-only capabilities [E] between classes in a single shared object graph.  Portals, pools, class extensions, typed selectors, and transactions are also added to this object graph to enhance capabilities.  The user may only work with objects attached to the portal he is logged into.  Most portals will have access to new class creation and benign extension creation of read-only classes.

Physical distribution of this shared object graph is not discussed in this paper, only the architecture of the capability-based object graph itself.

1  Smalltalk Extensibility:  Flexible Interfaces

Smalltalk [St-80] is a pure object-oriented language and system.  Everything is an object including windows, processes, and source code, and all objects are editable within the system, including classes and their methods.  When a user changes a class/method he is effectively creating a new version of that class, different from the one released with the system, and probably different from every other user's version of that class.  These different versions of common classes make it hard for systems to communicate with each other safely.

A user's code is written to work with his own versions of common classes.  For his code to reliably work with objects from other systems, those objects should adopt the class versions of his system, since that is what his calling code expects.

1.1  Rejected Architectures:  Inextensible Interfaces and Insecure Dynamic Environments

CORBA [OMG], RMI [Java], and other forms of remote procedure call do not adopt the class versions of the calling system but define explicit interfaces that both parties must adhere to.  In other words, a user is not allowed to call his custom methods on objects that may be remote, and he must not remove methods from objects that may be used remotely.  The least-common-denominator interface must be used between systems.  These fixed interfaces definitely help, as evidenced by the popularity of component architectures, but they hinder extensibility.

Smalltalkers are used to extending common classes (extending their interface) and would like to call the extended behavior on instances without worrying about where they are located.  One solution, as implied above, is to associate the caller's environment (set of classes) with the process.  When a remote object is called its class is looked up in the caller's environment and executed.  Associating the environment with the process rather than the object is an extreme form of late binding, and would not be too hard to implement efficiently [1].  With this architecture, the contract is the instance variable schema between the two different versions of the receiver's class.  As long as the schemas are the same (and their meanings are the same) the contract is satisfied.  Even if they are different, a schema migration converter can be used.  A caller is still calling an interface, it's just that he can now modify his version of the interface/class and the receiver will use it no matter where it is located.  This architecture seems promising until you consider security.  It could work with access control security, but fails miserably with capability-based security, as we will see below.

2  Preferred Security Model:  Capabilities

In this section we consider different security models, and choose the Capabilities model.  Then, in section 3, we present a new architecture for secure but extensible classes using capabilities.

2.1  Access Control Lists:  Inappropriate for Objects

Access control is a common security model that allows you to associate resources with users and the rights they have on each resource.  The resources can be anything like a file, an object, or a group of objects.  Gemstone [Gem], a multi-user Smalltalk database system, groups objects into segments and enforces access control on each segment.

Access control has problems like the "confused deputy problem" [CDP] that capabilities solve.  Also, access control does not fit well in an object model because objects don't map naturally to aggregate resources such as segments, and having access control on each object would be too much overhead.  Finally, objects encapsulate each other and delegate to each other.  The user is disjoint from objects being delegated to and therefore should not need access control specified for them.

2.2  Capabilities:  Object Encapsulation Automatically Provides Security

Capability-based security [E, Cap] takes advantage of object encapsulation and delegation.  An object is automatically secure because its instance variables are private.  An object's methods themselves define the capabilities (access rights) of the client objects that are pointing to it.  No extra security mechanism is needed.  The object model itself is the security mechanism.

The object system must enforce four simple rules for capabilities security to work: (1) the inability to fabricate pointers, (2) private instance variables, (3) code must only send messages to other objects or otherwise be trusted, and (4) no global variables.  In type-safe languages like Smalltalk, pointers can't be fabricated.  Smalltalk's instance variables are private, except for a few meta-operators which we will move to separate limited-access reflection objects.  Smalltalk only works by sending messages or executing special primitives.  Primitives will have to be trusted, see section 2.5.  Smalltalk has global variables, which we will get rid of and replace with limited-access pools, see section 3.2.

The object model provides the security architecture, but the user/programmer provides the security by carefully handing out the right capabilities to the right clients, only giving them what they need and no more.  For example, if a client object only needs to create instances of a class, don't give it the whole class, just a "capability" object that understands #new, which upon invocation calls #new on the real class held privately in its instance variable.  Giving out limited capabilities conforms to the principle of least privilege [POLP] and is key to the success of capability-based security.
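The #new-only capability described above can be sketched in a few lines.  This is an illustration only, written in Python rather than Smalltalk; the names make_instance_creator, InstanceCreator, and Account are hypothetical, and Python does not enforce the privacy that SharedSmalltalk would (for instance, a client can still recover the class from an instance via type()).

```python
def make_instance_creator(real_class):
    """Return a capability object that can only create instances of real_class."""

    class InstanceCreator:
        __slots__ = ()  # the capability carries no reachable state of its own

        def new(self, *args, **kwargs):
            # The real class is held in the enclosing scope, not in an attribute.
            return real_class(*args, **kwargs)

    return InstanceCreator()


class Account:  # hypothetical class with a privileged class-side operation
    def __init__(self, owner):
        self.owner = owner

    @classmethod
    def audit_all(cls):
        return "privileged"


creator = make_instance_creator(Account)
acct = creator.new("alice")
```

The client holding creator can make accounts but has no audit_all message to send, which is the least-privilege behavior the paragraph asks for.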

If you have to give out stronger capability objects, make sure the client is trustworthy, since it may pass the capability along to other objects.  It is possible to create smart capabilities that confirm the sender at runtime, but this requires maintaining an access control list that is usually more trouble than handing the capabilities out correctly in the first place.

A fundamental difference between the capabilities approach and the access control approach is their view of who the client is.  Is it the object sending the message or is it the end-user who triggered the chain of messages in the first place?  Capabilities takes the point of view that the immediate sending object is the client.  For example, Capabilities allows "kernel" mode operations to be triggered by user processes without switching modes, because the kernel trusts the object that is handling the user's request.  The kernel knows that the user can only get kernel access through these handlers.  The access control point of view requires the handler to start a kernel mode process to fulfill the user's request.  The kernel is trusting the handler anyway to start a kernel mode process.  Capabilities recognizes this delegation of responsibility and distributes it over the entire graph of objects, eliminating the need for a separate security mechanism.

2.3  Information Flow Security:  Requires Static Typing

Another approach to security currently being researched is information flow security [IFlow].  Information flow security annotates variable types with confidentiality and integrity constraints that look like access control lists.  The compiler checks these constraints when type-checking and makes sure information cannot flow from higher security types to lower security types.  Type-checking an entire system from end-to-end ensures end-to-end security.  In terms of capabilities, the compiler will tell you if your client objects are leaking your capabilities to the wrong objects.

Smalltalk does dynamic type-checking instead of static type-checking, so we have to rely on the principle of least privilege, and carefully hand out stronger capabilities only to trusted clients.  To enhance this trust, we will prevent changes to classes so imposter methods can't be added that will leak capabilities.

2.4  Security in a Distributed Environment:  Trusting the Machine

The machine itself has full capability to objects residing on its hardware.  A sufficiently motivated engineer could "decompile" the bits residing on his machine and figure out an object's private instance variables.  Realizing that the machine has full access to its bits means you have to treat the machine like any other client.  You either give it weak capabilities that you don't care if they get leaked, or you trust it and give it stronger capabilities.

Instance variables being transferred to another machine have to be considered public, since the machine can read them.  If the instance variables point to remote objects that you don't want the untrusted machine to get a handle to, then you can't transfer the object to it.  Instead, you must transfer a proxy that forwards messages to the object kept locally.  This way its private instance variables are kept secure on your machine.  We can automatically determine if an object's instance variables can be transferred with the object or if a proxy should be transferred instead:  If one of the object's instance variables is truly private, i.e. no methods give it out, then only a proxy to the object should be given out.  Otherwise, it is ok to transfer the object and its instance variable, since it is willing to give out its instance variables anyway.
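The transfer-or-proxy rule above reduces to a set comparison once we know which instance variables an object's methods give out.  The sketch below assumes that information is already available (how it is computed, e.g. by analyzing method code, is not specified here); the function name transfer_mode and the variable names are hypothetical.

```python
def transfer_mode(instance_vars, exposed_vars):
    """Decide how to hand an object to an untrusted machine.

    instance_vars: names of all instance variables of the object.
    exposed_vars:  names of instance variables that some method gives out.
    Returns "proxy" if at least one variable is truly private, else "copy".
    """
    truly_private = set(instance_vars) - set(exposed_vars)
    return "proxy" if truly_private else "copy"


# A point exposes both coordinates, so it may be copied to the other machine.
point_mode = transfer_mode({"x", "y"}, {"x", "y"})
# An account never gives out its vault, so only a proxy may be sent.
account_mode = transfer_mode({"balance", "vault"}, {"balance"})
```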

Finally, in a distributed environment with remote pointers, it should be impossible for a machine to create a remote pointer without being given one.  In other words, it should be impossible to guess a remote pointer, even after seeing many others [E].
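One common way to make remote pointers unguessable is to name each exported object with a large random token.  The sketch below is an assumption about how this could be done, not a description of any particular system; the Exporter class and its methods are hypothetical, and it uses Python's standard secrets module for cryptographically strong randomness.

```python
import secrets


class Exporter:
    """Hands out unguessable tokens that stand for local objects."""

    def __init__(self):
        self._table = {}  # token -> exported object

    def export(self, obj):
        # 32 random bytes; seeing other tokens gives no help in guessing this one.
        token = secrets.token_urlsafe(32)
        self._table[token] = obj
        return token

    def resolve(self, token):
        # Unknown tokens resolve to nothing; there is no way to enumerate the table.
        return self._table.get(token)


exporter = Exporter()
mailbox = object()
pointer = exporter.export(mailbox)
```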

2.5  Trusting the Code:  Bytecodes and Trusted Primitives

In order for methods to be treated like any other object and trusted to migrate/replicate to other machines, their code must be restricted or trusted.  Most code is restricted by representing it in an intermediate bytecode language which only allows manipulation of the receiver's instance variables and message sends to other objects it has access to.  This bytecode language usually contains other constructs like jumps and stack access in order to be practical, so the method has to be verified to make sure the jumps don't leave the method and the stack accesses are not out of bounds.  This verification can happen when a method is migrated/replicated or when it is first called.
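The two checks named above (jumps stay inside the method, stack accesses stay in bounds) can be illustrated with a toy verifier.  The instruction set here (push, pop, send, jump, jump_if, return) is invented for the example and is not Smalltalk's actual bytecode set; the verifier follows every control-flow path and tracks the stack depth at each instruction.

```python
def verify(code, max_stack):
    """Return True if every path through code keeps jumps and stack in bounds."""
    if not code:
        return False
    depth_at = {0: 0}  # instruction index -> stack depth on entry
    pending = [0]
    while pending:
        pc = pending.pop()
        op, *args = code[pc]
        if op == "push":
            pops, pushes, targets = 0, 1, [pc + 1]
        elif op == "pop":
            pops, pushes, targets = 1, 0, [pc + 1]
        elif op == "send":  # pops receiver plus args[0] arguments, pushes result
            pops, pushes, targets = args[0] + 1, 1, [pc + 1]
        elif op == "jump":
            pops, pushes, targets = 0, 0, [args[0]]
        elif op == "jump_if":  # pops the condition, may branch or fall through
            pops, pushes, targets = 1, 0, [args[0], pc + 1]
        elif op == "return":  # pops the return value, no successor
            pops, pushes, targets = 1, 0, []
        else:
            return False  # unknown instruction
        depth = depth_at[pc]
        if depth < pops:
            return False  # stack underflow
        depth = depth - pops + pushes
        if depth > max_stack:
            return False  # stack overflow
        for target in targets:
            if not 0 <= target < len(code):
                return False  # jump or fall-through leaves the method
            if target not in depth_at:
                depth_at[target] = depth
                pending.append(target)
            elif depth_at[target] != depth:
                return False  # paths disagree on stack depth
    return True


good = [("push",), ("push",), ("send", 1), ("return",)]
bad_jump = [("push",), ("jump", 7)]
```

Requiring all paths to agree on the stack depth at each instruction is a design choice borrowed from typical bytecode verifiers; it rejects loops that would grow the stack without bound.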

Primitive methods are low level operations that support the object illusion.  They have full access to machine code which could be used to violate object rules, so they have to be trusted.  A primitive method may not migrate/replicate to another machine unless it is given out by a trusted authority.

3  Secure Object Graph:  Read-Only Classes and Benign Extensions

Revisiting which class version a remote object should adopt in light of capability-based security, we see that an object should not trust clients to supply its own class version.  The client's class version may expose private instance variables, or leak message arguments to other parties.  If we want to provide the flexibility of client-defined extensions of receiver classes, then we must restrict these extensions to benign behavior.  This can be done by only allowing extensions that do not override existing methods and do not access instance variables.  Furthermore, to ensure that the base class is the same in both systems, we have to prevent changes to common classes.

To enforce read-only common classes we use capabilities to limit access to them so they only can be instantiated and inspected.  Since common classes are the same across users we will model them as actually being identical objects that are replicated but kept in sync [2].  So separate Smalltalk systems really become one shared distributed system, which we call SharedSmalltalk.  This single shared system allows us to concentrate on securing a single object graph without worrying about different user versions.  This object graph is distributed among the users.  How it is distributed is orthogonal to the secure object model itself, and is not discussed in this paper.  Efficient and transparent distribution of objects is actively being researched and is discussed in many papers like [Globe], [WebOS], and [Thor].

The rest of this section, which is the focus of this paper, describes the new objects needed to make the shared object graph capability-secure with respect to classes while allowing benign class extensions.  The new objects are: user portals, class pools, read-only classes, class extensions, typed selectors, and transactions.  Each is discussed below.

3.1  User Portals:  Users' Connection to the Graph

Users are represented as objects just like any other object and have their own capabilities.  The user's object is called a portal because it is his portal to the object graph.  Only objects that his portal points to are accessible to him, conforming to principle of least privilege.

Every SharedSmalltalk machine has a login object that points to available user-login objects.  A user-login object returns its user's portal upon receiving the correct password.  This portal contains pointers to all objects accessible to the user, including objects that may be shared among other portals.  Some objects in the user's portal may be displayed on the screen, others may be accessible by menu or by name typed into a command line.  Messages can be sent to these objects by clicking buttons or by typing messages into the command line.  Results if any may be displayed, or named for future command line use.  Notice that in a multi-user environment there will not be a single display, so no side-effect drawing can be done unless the display is submitted with the request.
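A user-login object of the kind described above might look like the following sketch.  The class name UserLogin, the login method, and the choice of salted PBKDF2 hashing are assumptions made for illustration; the portal is modeled as a plain dictionary of named objects.

```python
import hashlib
import hmac
import os


class UserLogin:
    """Holds a portal privately and releases it only for the right password."""

    def __init__(self, password, portal):
        self._salt = os.urandom(16)
        self._digest = self._hash(password)  # the password itself is not stored
        self._portal = portal

    def _hash(self, password):
        return hashlib.pbkdf2_hmac("sha256", password.encode(), self._salt, 100_000)

    def login(self, password):
        # Constant-time comparison avoids leaking how many bytes matched.
        if hmac.compare_digest(self._hash(password), self._digest):
            return self._portal
        return None


portal = {"mailbox": [], "chat": object()}  # hypothetical portal contents
user_login = UserLogin("s3cret", portal)
```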

The user may create objects that represent himself and give them out.  For example, he may create a chat object that privately holds the user's display and responds to sendText: messages by displaying the text on his display.  He may add this chat object to a shared object that his friends also point to, so they can find it.  He may also send it directly to another person if he has his email.  A SharedSmalltalk email is an object that privately has a pointer to the user's mail box, which could just be a window in his display.  It responds to sendObject: by adding the argument to the user's mail box.

3.2  Pools:  Hierarchical Namespaces

Current Smalltalk keeps all classes in a single global dictionary.  There is no concept of package, although some dialects do support packages/modules/namespaces [VisualWorks, SmallScript].  Following the principle of least privilege, a client class should only have access to classes it needs to collaborate with.  These collaborating classes can be held in a pool that the client class has access to.  A pool is a collection of named objects.  Each class will hold such a pool, abolishing the global dictionary.  Pools are the same as pool dictionaries, used sparingly in current Smalltalk, except pools can inherit from other pools.  All objects in a pool's hierarchy are visible to clients of the pool.

Classes in current Smalltalk also have class variables; these are static variables visible to all instances of the class and its subclasses.  In this sense, a class acts like a pool.  Its class variables and class variables in inherited classes (pools) are visible.  So in SharedSmalltalk, a class is a pool, and class variables, pool variables, and global variables are consolidated into just pool variables.  A class may contain named objects, and it may inherit from zero or more other pools, one of them being its superclass.

Pool elements are named objects rather than variables, meaning you cannot assign to them.  However, some of the objects may be holder objects so you can get and set their contents.  Again this was done to follow the principle of least privilege.  Likewise, you may or may not be able to add/remove elements from the pool, depending on what capabilities you have to it.

Pools are like packages/modules but simpler and more flexible.  A class can be in more than one pool and a class may have its own imports separate from other classes accessible from the same pool.  In other words, the class is the module and its pools are its imports.  Thanks to inheritance of pools most imports don't have to be restated.  The root Object class will import a pool containing common classes like the collection classes.  This way subclasses will automatically inherit access to these common classes.
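The pool behavior described in this section (named objects that cannot be reassigned, holder objects for mutable state, and inheritance from other pools) can be sketched as follows.  The classes Pool and Holder and the sample entries are hypothetical, and the depth-first search order over parent pools is an assumption.

```python
class Holder:
    """A named object whose contents may be read and replaced."""

    def __init__(self, value):
        self.value = value


class Pool:
    """A read-only collection of named objects that inherits from other pools."""

    def __init__(self, entries, parents=()):
        self._entries = dict(entries)
        self._parents = tuple(parents)

    def lookup(self, name):
        # Local names win; otherwise search inherited pools in order.
        if name in self._entries:
            return self._entries[name]
        for parent in self._parents:
            try:
                return parent.lookup(name)
            except KeyError:
                pass
        raise KeyError(name)


# Pool of common classes imported by the root class, then inherited downward.
common = Pool({"OrderedCollection": list, "Set": set})
object_pool = Pool({}, parents=[common])
client_pool = Pool({"Counter": Holder(0)}, parents=[object_pool])

client_pool.lookup("Counter").value += 1  # holders allow get and set
```

There is deliberately no method for rebinding a name: a client that should be able to add or remove entries would be given a separate, stronger capability to the pool.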

3.3  Read-Only Classes:  Safe Sharing of Behavior

In keeping with the principle of least privilege, most classes found in pools will be read-only capabilities to their real class.  They will only respond to instance creation and method viewing messages.  Proprietary classes may not even allow viewing of method code, just names and comments.

Since read-only classes are responsible for instance creation, specific instance creation methods will reside in them instead of in metaclasses.  In current Smalltalk, the class of every class is a unique subclass of Class called a metaclass.  The metaclass hierarchy parallels the normal class hierarchy, so class methods can inherit from their super just like instance methods.  Class methods like "new" and "methods" are found in Class, but each class may add its own methods in its metaclass.  Usually only specialized instance creation methods are added to metaclasses.  In order to support the same inheritance of instance creation behavior, read-only classes will have an inheritance hierarchy paralleling the real class hierarchy.  The difference between read-only classes and metaclasses is that read-only classes are not also the classes of the real classes.  Real classes will now be direct instances of Class, abolishing the need for metaclasses.

3.4  Detached Class Extensions:  Benign Extensions to Read-Only Classes

Often a client class wants to add some convenience methods to a class that it is collaborating with.  Traditionally this required changing the collaborating class, which in a shared environment causes problems as described in the introduction of this paper.  In SharedSmalltalk, most collaborating classes are read-only, so there is no way to modify them.  Instead, the client may create a new class extension containing his convenience methods and add it to his pool.  Only clients that include the class extension in their pools will be able to call its methods.

A class extension object is like a class except it has no instance variables, it points to the base class it is extending, and it points to its extension subclasses along with its extension superclass if any.  It may also import its own pools.  Methods in a class extension may refer to the receiver as "self" but they cannot refer to any of its instance variables.  Also, they cannot refer to any of the base class's pool variables, only the extension's own.  In other words, the class extension has no more rights than any other client of the class.

The base class is unaware of any class extensions attached to it, hence the term "detached" class extensions.  This allows anyone to add an extension with just read-only access to a class.  We could allow a special capability to the read-only classes to point to all its class extensions, but this would scale badly.  Besides, class extensions are only used by specific clients so it would not make sense for the base class to carry references for specific client usages.

The compiler and method lookup have to be modified a little to support finding class extension methods, since the base class is unaware of them.  When compiling a method, each message-send is looked up in the method's class pool to find the abstract class (interface) that contains the selector being sent, and the message-send is bound to that selector.  Each selector knows the class or class extension it came from (see 3.6 Typed Selectors below).  When a message-send with a selector from a class extension is executed, the class extension and all its subclass extensions are searched in the order of their base class inheritance hierarchy starting from the receiver's class.  When a message-send with a selector from a normal class is executed, the method is looked up as usual in the base class hierarchy.
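The modified lookup can be sketched as below.  This is one possible reading of the rule above, with hypothetical names (Klass, Extension, lookup): a selector carries its origin, and when the origin is a class extension the search walks the receiver's class hierarchy, consulting at each level the extension in that family whose base is that class.

```python
class Klass:
    def __init__(self, name, superclass=None, methods=None):
        self.name = name
        self.superclass = superclass
        self.methods = methods or {}


class Extension:
    """Detached extension: knows its base class, which does not know it."""

    def __init__(self, base, methods, parent=None):
        self.base = base
        self.methods = methods
        self.subextensions = []
        if parent is not None:
            parent.subextensions.append(self)

    def family(self):
        # This extension followed by all of its subclass extensions.
        yield self
        for sub in self.subextensions:
            yield from sub.family()


def lookup(receiver_class, selector, origin):
    """Find the method for selector, whose introducing class/extension is origin."""
    if isinstance(origin, Extension):
        by_base = {ext.base: ext for ext in origin.family()}
        source = lambda k: by_base[k].methods if k in by_base else {}
    else:
        source = lambda k: k.methods
    k = receiver_class
    while k is not None:  # walk the base class hierarchy from the receiver's class
        if selector in source(k):
            return source(k)[selector]
        k = k.superclass
    return None


collection = Klass("Collection", methods={"do:": "Collection>>do:"})
ordered = Klass("OrderedCollection", collection)
bag = Klass("Bag", collection)
coll_ext = Extension(collection, {"second": "generic second"})
Extension(ordered, {"second": "indexed second"}, parent=coll_ext)
```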

3.5  Subclassing Read-Only Classes

A user may also add a subclass to any class.  If the superclass is read-only then the subclass cannot access the superclass's instance variables or pool.  In other words, the subclass, like the class extension, has no more rights than any other client.  If the superclass is a real (mutable) class then the subclass may access its instance variables and pool.  Note that the subclass can only access instance variables and pool variables up to the highest real class.  For example, if the superclass's superclass is read-only, that class's instance variables and pool are not visible.
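The visibility rule for subclasses can be expressed as a walk up the superclass chain that stops at the first read-only class.  The sketch below assumes that reading of "up to the highest real class"; ClassInfo, visible_inherited_ivars, and the sample classes are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClassInfo:
    name: str
    ivars: tuple
    read_only: bool
    superclass: Optional["ClassInfo"] = None


def visible_inherited_ivars(superclass):
    """Instance variables a new subclass of superclass may access directly."""
    visible = []
    k = superclass
    # Stop at the first read-only class: it and everything above it are opaque.
    while k is not None and not k.read_only:
        visible.extend(k.ivars)
        k = k.superclass
    return visible


obj = ClassInfo("Object", (), read_only=True)
shape = ClassInfo("Shape", ("origin",), read_only=True, superclass=obj)
widget = ClassInfo("Widget", ("bounds",), read_only=False, superclass=shape)
```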

3.6  Typed Selectors:  Binding Messages to Collaborator Interfaces

When writing a method and sending a message to a variable or message result, you intuitively know which abstract class you are targeting, otherwise how would you know the name of the message.  This implicit knowledge is not captured in current Smalltalk.

Variable types, as used in statically-typed languages like Java and C++, do capture this knowledge, but they capture it in the variable instead of the message.  This works, but it requires type casting, parametric types, and union types (or single abstract types that cover all possible actual types of a variable).

SharedSmalltalk captures the abstract class of each message rather than each variable.  This captures the information implied, without the need for type casting, parametric types, etc.  Also, the compiler automatically figures out the abstract class from the message name without requiring the user to enter it.  This requires associating selectors (method names) with their abstract classes, hence the term "typed selectors".  The abstract class of a selector is the class that introduces it.  The set of all selectors that a class introduces is called its interface.  For example, "do:" is in the Collection interface, even though its subclasses override it.  The correct abstract class is inferred for each selector when its method gets added using this introduction rule.  The only time the user has to explicitly name the abstract class of a new method is when the selector is from another class outside the method class's hierarchy and it wants to be polymorphic with it.  If the abstract class is not specified in this ambiguous case it is assumed the selector is new for the method class and not polymorphic with the other class.  In effect, they are different selectors (although they have the same name) and can't be used polymorphically.
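The introduction rule (a selector belongs to the interface of the class that first introduces it) can be computed from the method dictionaries alone.  The following sketch uses hypothetical data structures: a mapping from class name to its selector set and a mapping from class name to superclass name.  It does not model the case where the user explicitly names an abstract class outside the hierarchy.

```python
def interface(methods_by_class, superclass_of, cls):
    """Selectors that cls introduces, i.e. defines but does not inherit."""
    inherited = set()
    k = superclass_of.get(cls)
    while k is not None:
        inherited |= methods_by_class[k]
        k = superclass_of.get(k)
    return methods_by_class[cls] - inherited


methods = {
    "Object": {"printOn:"},
    "Collection": {"do:", "size", "printOn:"},   # overrides printOn:
    "OrderedCollection": {"do:", "addLast:"},    # overrides do:
}
supers = {"Collection": "Object", "OrderedCollection": "Collection"}
```

With these sample classes, "do:" lands in the Collection interface even though OrderedCollection overrides it, matching the example in the text.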

When the user is writing a message inside a method the compiler looks up the message name in all class interfaces in the method's class pool.  If the message name is found in a single class interface then that class's selector is bound to the message.  If none are found, then alternatives in the pool that are close in spelling are presented to the user.  Notice, the user is not allowed to send a message selector that is not visible to the method's class, adhering to the principle of least privilege.  Finally, if the message name is found in more than one class interface, then those classes are presented to the user for him to choose.  Since the pool already narrows the potential selectors, this last scenario should be rare.  But if it occurs, once a choice is made for a certain variable the same interface choice will be made automatically for subsequent ambiguous messages sent to the same variable.  Of course, the user may always explicitly state the intended interface by prefixing the message with it, as in "".
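The three outcomes of compile-time selector resolution (bound, ambiguous, or unknown with spelling suggestions) can be sketched as a single function.  The function name resolve and the result tags are hypothetical; spelling suggestions use difflib.get_close_matches from the Python standard library as a stand-in for whatever matching the real compiler would use.

```python
import difflib


def resolve(message, interfaces):
    """Resolve a message name against the interfaces visible in a pool.

    interfaces maps a class name to the set of selectors it introduces.
    """
    owners = sorted(c for c, sels in interfaces.items() if message in sels)
    if len(owners) == 1:
        return ("bound", owners[0])       # unique interface: bind silently
    if owners:
        return ("ambiguous", owners)      # let the user choose among these
    visible = sorted(set().union(*interfaces.values()))
    return ("unknown", difflib.get_close_matches(message, visible))


pool_interfaces = {"Collection": {"do:", "size"}, "Stream": {"next", "size"}}
```

Because only interfaces in the method's class pool are passed in, a selector outside the pool can never be bound, which is the least-privilege property noted above.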

In short, typed selectors bind messages to collaborators' interfaces making the code easier to trace and understand, without requiring any extra programming (unlike typed variables).  It also makes detached class extensions possible (as per the last paragraph in section 3.4).

3.7  Nested Transactions:  Layered Alternatives of the World

Every process runs in its own transaction, or shares a transaction with cooperating processes.  Every object read/write is only visible to the process's transaction and its nested transactions.  Upon process completion or explicit commit, changes are committed to the parent transaction, which is the base system if the transaction is not nested.  The commit is aborted if reads/writes conflict with other committed sibling transactions.  An aborted transaction may continue to exist and operate normally; it just can't be committed.  Also, a client may choose to run a process in any open transaction that is available to it, providing multiple perspectives [Us, PerspectiveS].
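A minimal model of these nested transactions is sketched below.  It assumes optimistic conflict detection with a per-key version counter, which is one possible mechanism and not something the design prescribes; the classes Store and Transaction and their methods are hypothetical.

```python
class Store:
    """The base system: values plus a per-key version counter."""

    def __init__(self, values=None):
        self.values = dict(values or {})
        self.versions = {}

    def get(self, key):
        return self.values.get(key)

    def version(self, key):
        return self.versions.get(key, 0)

    def apply(self, writes):
        for key, value in writes.items():
            self.values[key] = value
            self.versions[key] = self.version(key) + 1


class Transaction(Store):
    """A layer over a parent (the base Store or another Transaction)."""

    def __init__(self, parent):
        super().__init__()
        self.parent = parent
        self.seen = {}           # key -> parent version when first touched
        self.committable = True

    def _observe(self, key):
        self.seen.setdefault(key, self.parent.version(key))

    def get(self, key):
        self._observe(key)
        if key in self.values:
            return self.values[key]
        return self.parent.get(key)

    def put(self, key, value):
        self._observe(key)
        self.apply({key: value})  # visible only here and in nested transactions

    def commit(self):
        # Conflict: a sibling committed a key after this transaction touched it.
        if any(self.parent.version(k) != v for k, v in self.seen.items()):
            self.committable = False
        if not self.committable:
            return False
        self.parent.apply(self.values)
        return True


base = Store({"x": 1})
t1, t2 = Transaction(base), Transaction(base)
t1.put("x", 2)
t2_view_before = t2.get("x")   # t2 does not see t1's uncommitted write
t2.put("x", 3)
t1_ok = t1.commit()
t2_ok = t2.commit()            # conflicts with t1's committed write
```

The aborted transaction t2 remains usable and still answers its own value for "x"; synchronizing it with the parent to make it committable again is not modeled here.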

A transaction can be inspected just like any other object and compared with other transactions.  They can even be edited, merged, or synchronized with parent transactions.  Synchronizing means updating the transaction with data from recently committed transactions, possibly losing some of the changes the transaction has done.  These losses are presented to the client object that requested the synchronization, and if accepted, the transaction is made committable again.

These flexible transactions are equivalent to development layers [PIE] and are useful for managing different versions of source code and testing them out.  However, you are still restricted in what you can change because of the capability-based security, i.e. read-only classes.  If you want to make dramatic changes you have to make copies and change them.

4  Conclusion:  Compromise for Reliability

Smalltalk is great because of its extensibility at all levels.  But in order to share Smalltalk with others we have to give up some, but not all, of this autonomy.  In particular, we can still add benign personal extensions.  This goes a long way since many class extensions are benign.

Limiting changeability to benign extensions also forces us to make changes without breaking existing code.  Dramatic changes have to be developed as copies that can coexist with the existing mechanism.  When the copy is finally working, pointers that you have capabilities to can be switched to the copy.  If something breaks, the pointers can be switched back.  Alternative versions that can reside together allow incremental migration of client code.

5  Related Work

Distributed Smalltalk [DST] and Remote Smalltalk [RST] allow remote method invocation between Smalltalks but do not use the caller's class version or class extensions.  [Squeak-E] adds secure distributed processing and eventual sends a la [E] to Squeak Smalltalk, but like [DST] and [RST] does not adopt the caller's class version or class extensions.  [Islands] restricts certain processes using capabilities, but either allows arbitrary class changes inside projects or does not allow changes or extensions of classes at all (it is unclear which).

Envy/Developer [Envy] version control system (now part of IBM Smalltalk [VisualAge]) has class extensions that get merged into their base class when loaded.  A class extension may access instance variables and override methods and is therefore not benign.  VisualWorks [StORE] has the same problem.  Packages in Envy and StORE are similar to pools, except pools are more flexible as explained at the bottom of section 3.2.  Also, pools and benign class extensions are runtime objects, not separate development time objects that need to be loaded.

[SmallInterfaces] adds interfaces to Smalltalk but does not bind them to messages or classes, like typed selectors (section 3.6).

6  Future Work:  Transaction Security, Multiple Inheritance, and Distribution

Transactions have to be researched more to allow meaningful inspection and manipulation (editing, merging, synchronizing, etc) without violating object security.

Class extensions may generalize well to multiple inheritance, where a class extension is like a new abstract superclass.  Typed selectors would narrow down the search path and prevent ambiguities.  Instance variables would act like private accessor methods so they can be overridden, alleviating the "diamond problem" in inheritance.  See [MI] for more.

Distribution involving smart replication and migration, along with persistence has to be addressed.  [Squeak-E] is a good starting point because of its secure communication.

The SharedSmalltalk design presented in this paper is still just a design.  An implementation still has to be built.

7  Notes

[1]  Diffs of versions can be migrated on demand.  Class versions that are different can be swapped in and out of class holders that instances point to.  The method cache would be different for each environment.

[2]  Since the replicated bits will be on the user's machine, he could try to change them, but he won't be able to change them from Smalltalk.  Even if he does go to the trouble of changing the bits he can't affect other machines.

8  References

[E]  Open Source Distributed Capabilities.

[St-80]  Squeak, the Open-Source Smalltalk.

[OMG]  Common Object Request Broker Architecture.

[Java]  Java Remote Method Invocation.

[Gem]  Gemstone/S v6.0.  Gemstone/S Programming Guide, December 2001.

[CDP]  Norman Hardy.  The Confused Deputy Problem.  Operating Systems Review, October 1988.

[Cap]  E. C. Van Horn.  Programming Semantics for Multiprogrammed Computations.  Communications of the ACM, vol 9, p 143-154, March 1966.

[POLP]  Jerome Saltzer and Michael Schroeder. The Protection of Information in Computer Systems.  Proceedings of the IEEE, 63(9), September 1975.

[IFlow]  Andrei Sabelfeld and Andrew C. Myers.  Language-Based Information-Flow Security.  IEEE Journal on Selected Areas in Communications, special issue on Design and Analysis Techniques for Security Assurance, 21(1), January 2003.

[Globe]  A. Bakker, E. Amade, G. Ballintijn, I. Kuz, P. Verkaik, I. van der Wijk, M. van Steen, and A. Tanenbaum. The Globe Distribution Network.  In Proc. 2000 USENIX Annual Conference (FREENIX Track), pp. 141--152, San Diego, CA, USA, June 2000.

[WebOS]  A. Vahdat, T. Anderson, M. Dahlin, E. Belani, D. Culler, P. Eastham, and C. Yoshikawa. WebOS: Operating System Services For Wide Area Applications. In Proceedings of the Seventh Symposium on High Performance Distributed Computing, July 1998.

[Thor]  Thor: A Distributed Object-Oriented Database System.



[Us]  Randall B. Smith and David Ungar. A Simple and Unifying Approach to Subjective Objects. Theory and Practice of Object Systems 2(3):161-178, 1996.

[PerspectiveS]  Robert Hirschfeld and Matthias Wagner.  PerspectiveS - AspectS with Context.  2002.

[PIE]  Ira P. Goldstein and Daniel G. Bobrow.  A Layered Approach to Software Design.   Xerox Corporation, CSL-80-5, December 1980.

[DST]  John K. Bennett.  Distributed Smalltalk: Inheritance and Reactiveness in Distributed Systems.  PhD Dissertation, University of Washington, 1988.

[RST]  Diego Gomez Deck.  rST - Remote Smalltalk.  2002.

[Islands]  Lex Spoon.  Objects as Capabilities in Squeak.  August 2000.

[Squeak-E]  Rob Withers.  Squeak-E.

[Envy]  Vikas Malik.  Envy/Manager FAQ.  1994.


[StORE]  Cincom.  Team Development with VisualWorks.  October 2000.

[SmallInterfaces]  Benny Sadeh and Stéphane Ducasse.  Adding Dynamic Interfaces to Smalltalk.  Journal of Object Technology, vol. 1, no. 1, May-June 2002, pages 63-79.

[MI]  Anthony Hannan.  Stateless Multiple Inheritance with Typed Selectors.  2002.

Lex Spoon.  Capabilities: a Possible Security Model for Distributed Computing.  May 1999.