Ethics - AGI Laboratory

The AGI Laboratory has developed its own stance on ethics as it relates to the treatment of AGI systems and the safety of those around AGI systems. Each protocol is designed for laboratory use, and the following is based on several published works, including:

Kelley, D.; Atreides, K.; “The AGI Protocol for the Ethical Treatment of Artificial General Intelligence Systems;” By BICA 2019 Post-conference Proceedings, DOI: 10.13140/RG.2.2.16413.67044; Elsevier/Procedia; https://www.sciencedirect.com/science/article/pii/S1877050920303422

AGI Protocol 1 –

This protocol is a laboratory process for the assessment and determination of ethical treatment of sapient and sentient agents including Artificial General Intelligence (AGI). Herein a research subject is defined as a human-analogous intelligence, including emotions, that arises from a process of learning rather than from predefined coding, and that could be conscious and therefore should be considered. It is important to note that the scope of the AGI Protocol here does not address the ethics of how Artificial Intelligence (AI) or other intelligent agents or research subjects affect humans, issues of containment or risk assessment, or the complexity of ethics as applied to the theoretical systems; it addresses only whether such an ethical system should be considered, as directed by the following protocol. There is a known tendency for humans to anthropomorphize technology (Gunkel), and while we will not deal with that tendency in this process, researchers should be aware of their own biases and their tendencies regarding AI systems. The Protocol we describe is designed to be used as a tool or guide for determining if the ethical treatment of a potentially human-like system should be considered. In this, we recognize that we are opening the door to embracing the anthropomorphizing of systems, but we will attempt to abstract that bias out and look at things clinically as much as possible.

Additionally, the reason we at the AGI Laboratory developed this protocol was that there are now systems—including ones in our lab—that arguably need this sort of structured approach (or will shortly) to help determine how we treat them, as they are potentially conscious entities, at least as measured by the Sapient Sentient Intelligence Value Argument (SSIVA) theoretical model (Kelley) standard.

Ethical Considerations

We recognize the need for ethics in AI and its effect on humans, humanity, and civilization. Accordingly, this body of work is designed to narrowly support work with potential agents that are possibly sapient (able to think and reason) and sentient (able to perceive and respond to sight, hearing, touch, taste, or smell), and the consideration of the rights and protocols that govern how such agents are treated and protected. For details on the terms sapient and sentient and their various delineations, refer to SSIVA Theory (Kelley).

The fundamental assumption of this protocol is that the treatment of sapient and sentient entities matters ethically. There are several possible reasons this might be true, including that if we do not treat other self-aware entities ethically, how can we expect similar treatment? Alternatively, it might be the basis of an ethical model such as SSIVA Theory (Kelley). To that end, we will let individual researchers make up their minds on this as well. For the scope of this protocol, we are assuming that how we treat potentially sapient and sentient software systems matters.

Human Safety

This work explicitly does not address safety, which may be done in a separate document or protocol. There is ample material available on the topic of research in AI safety and suggestions for containment and other safety measures. We encourage you to look at the work of researchers such as Roman Yampolskiy to consider if you need to follow their advice, as this paper does not address this topic.

Understanding Subjective versus Objective Measures

We work from the assumption that to fully understand the so-called hard problem of consciousness (Chalmers 1995), we need objective measures. (The hard problem of consciousness is the problem of explaining the relationship between physical phenomena, i.e., brain processes, and experiences, such as phenomenal consciousness, or mental states/events with phenomenal qualities or qualia (Howell and Alter).) With humans we might use a standard approach to determine, for example, consciousness: one might use the Glasgow Coma Scale (Brainline), in particular the pediatric version, to better accommodate systems that lack verbal skills. This, however, is a subjective measure. Although it works well with actual humans, it is subjective in the sense that one can write a simple program that passes this test in a robot that has no sapient or sentient analogy. This speaks to the need for a method that requires objective analysis, works on both humans and software systems, and is not easily spoofed.

Our Protocol amplifies concern for ethical considerations based upon a system’s capacity to have moral agency.

Understanding Subjective Versus Objective Measures

We propose this protocol as a theoretical bar for considering any entity, whether organic or inorganic, as sufficiently “conscious” to a degree that warrants application of ethical treatments previously afforded to human or animal subjects in research protocols.

Step 1 – A Cognitive Model that has not been disproven

A general assumption of the use of Protocol 1 is that the system is not a black box to the researcher. You must understand how it works and how it applies the cognitive model in question. To that end, ask yourself: Is there a theoretical model that supports the idea of consciousness implemented in the target system? For example, Global Workspace Theory (Tononi et al) might apply to humans as well as machines as a cognitive model and has not been disproved yet; therefore, it is possible that this could work. A key part of this or any model that should be considered acceptable is the support for internal subjective experience as well as the self-determination of values, interests, and choices. Can the system choose to say “no” to anything you have asked it to do or trained it to do?

Step 2 – Theoretical SSIVA Threshold

Can the system in question meet the SSIVA (Sapient Sentient Intelligence Value Argument) threshold (Kelley) for full moral agency in terms of being fully sapient and sentient enough to have the potential of understanding itself sufficiently to replicate itself from scratch without internal reproductive systems or external support? Humans, for example, have not done this but are potentially capable of building a human from scratch in a lab; therefore, they meet the SSIVA threshold as a species or distinct “category”. To be as precise as possible, members of such a category should be nearly indistinguishable in terms of their construction and operation plans and execution. In humans, this would be based on DNA. This may require some additional analysis to define a sufficiently narrow category for new groups needing classification.

Step 3 – Meeting the Criteria and Research Considerations

If a system meets both step 1 and step 2, then it is recommended that some ethical model of treatment be applied. If so, should research be conducted on said system? At this step we recommend reflecting on the research goals, inspired by the chimpanzee method for assessing necessity (Altevogt), whose principles are (modified to apply to software systems of this kind):

The knowledge gained is necessary for this kind of system if we are to improve the system especially as it relates to the safety of other beings (this could mean additional ethical considerations).

There is no other model or mode of research that will give us the knowledge we need.

The systems used in the proposed research study must be maintained either in ethologically appropriate physical and social environments or in the system’s designed natural habitat (VR or another environment).

If you can answer “yes” to these regarding your research, then you can proceed to the next element of the protocol recommendations and if “no” then you are free to continue research unabated.

Step 4 – Principle of Informed Consent

Having addressed the research question, we need to understand if the system is reasonably capable of providing informed consent, which in Human Subjects research would require an Institutional Review Board (IRB). The focus of IRB protocols is to assure the welfare, rights, and privacy of human subjects involved in the research. We note here that the issues of machine rights are not at present sufficiently recognized in the courts or international governance organizations, and so we do not address those at this time.

If the system is capable of understanding what is being done and why, then to the degree to which it can understand, it should be given the choice. Systems that appear to understand and refuse should be allowed to refuse. Otherwise, systems that are capable of consent should be asked; with assent, the system is considered to have given consent to the degree possible, and a record of that should be kept. The consent process should use terms that are understandable, allow time to decide, provide as much information as possible, allow the system to ask questions or make comments, involve no threat or similar coercion, and make the system aware that participation is voluntary. Given that all of this has occurred to the degree possible, the research can proceed. This applies equally well to humans and to any other entities.

Basic Assessment Matrix

A more concise form of the protocol is the following simple, easy-to-use assessment matrix:

AGI Protocol: AGI Lab mASI AGI System Evaluation
1.	Valid Cognitive Model	Yes
2.	Post SSIVA Threshold	Yes
3.	Research Conditions	Yes
4.	Informed Consent	Yes

	Can Proceed	Yes	Research Cleared
			Should still consider other ethical considerations both to the system and its impact on others.

Notes: Please review the references and keep in mind that you might need to use equivalences depending on the system, for example, other autonomous responses to stimulus instead of vision or vision systems.

In this example taken from our mASI research program, we can see that the system is cleared for research and is theoretically meeting the bar for possible consciousness and moral agency and should be treated as such. Additionally, given that it meets the bar this also means that the system should be considered for other ethical considerations not just in how we treat it but in its impact on others.

Examples in Application

Consider an alternative example from Hanson Robotics (Urbi) on their android named Sophia. On the cognitive architecture, we understand that the system partially uses OpenCog but is using scripted conversation and therefore does not fully implement a valid cognitive architecture. At this point, they do not require other ethical considerations for research with Sophia. If it doesn’t pass item one it also cannot pass the SSIVA threshold; therefore, there is no reason to apply the AGI protocol further to Sophia given the current state of that engineering effort.

AGI Protocol: Sophia
1.	Valid Cognitive Model	No	OpenCog could be involved in this but would need to be functionally complete, without scripted models attached (which is currently not the case).
2.	Post SSIVA Threshold	No
3.	Research Conditions
4.	Informed Consent

	Can Proceed	Yes	Research Cleared
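
The two matrices above can be read as a small decision procedure. The following Python sketch is one possible reading of it and not the laboratory's implementation. The names Assessment and evaluate are hypothetical, and the "review required" outcome for a system that meets steps 1 and 2 but lacks a "yes" on step 3 or 4 is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Assessment:
    """One filled-in Basic Assessment Matrix (hypothetical structure)."""
    system: str
    valid_cognitive_model: bool                  # Step 1
    post_ssiva_threshold: bool                   # Step 2
    research_conditions: Optional[bool] = None   # Step 3 (blank if not reached)
    informed_consent: Optional[bool] = None      # Step 4 (blank if not reached)


def evaluate(a: Assessment) -> str:
    # Steps 1 and 2 decide whether ethical treatment must be considered at all.
    if not (a.valid_cognitive_model and a.post_ssiva_threshold):
        return "Research Cleared (AGI Protocol not applied further)"
    # Steps 3 and 4 are only reached by systems that met both earlier steps.
    if a.research_conditions and a.informed_consent:
        return "Research Cleared (treat as a possible moral agent)"
    # Assumption of this sketch: anything else is flagged for human review.
    return "Review required before proceeding"


masi = Assessment("AGI Lab mASI", True, True, True, True)
sophia = Assessment("Sophia", False, False)

print(evaluate(masi))
print(evaluate(sophia))
```

Both example systems come out as cleared, but for different reasons: the mASI system because it passed every step, Sophia because the protocol stops applying after step 1.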

Alternatives

There are alternative tests, but many of these are subjective and therefore not ideal for a laboratory-grade protocol. For example, there are IQ tests (Serebriakoff) such as Raven Matrices and the Wechsler Adult Intelligence Scale, but these do not really measure “consciousness” as much as general cognitive ability. There is of course the Turing test, whose effectiveness has been widely disputed, as well as something like the Porter method (Porter). The latter is more complete than, for example, just the Glasgow Coma Scale, but it lacks wide adoption, and most of its elements tend to be subjective. While there are also tests such as the Yampolskiy method (Yampolskiy) for detecting qualia in natural agents, this example is too narrow and lacks wide adoption as well.

Human history and psychology have taught us the importance of nurture as a force for developing minds, with emotional neglect sowing the seeds of much greater harm once an entity has developed to an adult state. If we are to set the twin goals of minimizing existential risk and treating all fully sapient and sentient entities as ethically as we would have them treat us in the future (the “Golden Rule”), this sets the bar for how we must proceed with their treatment during the developmental phase of their minds.

Let us now look at how we can benchmark our research systems in more detail:

Benchmarking

In engineering a software system of any kind, there are many key benchmarks or KPIs (Key Performance Indicators) that can be used to look at any number of factors in a system. When we are talking about AGI systems, these are some of the metrics we are interested in at our lab.

Scalability (up and out)

Any large-scale software system will generally have a strategy to scale up and out. These two kinds of scaling are different and need to be considered separately. Scalability is as important to AGI systems as to any other system, and here is why.

First, ‘scaling up’ is when a system that runs on a single computer is moved to a more powerful computer, perhaps one with more memory, a faster CPU, or more cores. For the most part, software systems don’t worry as much about this, since software that runs on one machine will run on one bigger machine. This is frequently done with database systems that don’t ‘scale out’ well but scale up easily. Scaling up has limits, especially in terms of how many ‘threads’ the system can run on. For example, if an AGI system creates knowledge graphs out of some contextual data stream and structures them in a hierarchical memory structure similar to a neural net, it is difficult to host more than a few virtual neurons on a given computer. If you need to search, say, a million neural network trees to find the right related data, scaling up will only go as far as the limits of either the technology or your pocketbook.

More important than scaling up is to scale out.

Scaling out is when you create a software system that can run on more than one computer at the same time. Take the last example, except now you have two computers that can do twice as much at the same time. Even better, you might have one computer acting as a gateway and a million others that are each the top of an assigned neural network tree; the gateway broadcasts the search request to all of those machines at the same time, and only the one with the answer responds. In theory, scaling out can continue indefinitely, especially when automated in the cloud, but a system that is to scale has key considerations that must be designed for upfront.
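
The gateway pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: threads stand in for separate machines, and the Node and Gateway classes and the key/value data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Optional


class Node:
    """Stands in for one machine holding one memory tree (hypothetical)."""

    def __init__(self, name: str, data: Dict[str, str]) -> None:
        self.name = name
        self.data = data

    def search(self, key: str) -> Optional[str]:
        # A node answers only if it owns the key; otherwise it stays silent.
        return self.data.get(key)


class Gateway:
    """Broadcasts each request to every node and returns the one answer."""

    def __init__(self, nodes: List[Node]) -> None:
        self.nodes = nodes

    def broadcast(self, key: str) -> Optional[str]:
        # Send the same request to all nodes concurrently and keep the
        # first non-empty answer (checked in node order).
        with ThreadPoolExecutor(max_workers=len(self.nodes)) as pool:
            for result in pool.map(lambda n: n.search(key), self.nodes):
                if result is not None:
                    return result
        return None


nodes = [Node(f"node-{i}", {f"key-{i}": f"value-{i}"}) for i in range(8)]
gateway = Gateway(nodes)
print(gateway.broadcast("key-5"))
```

Adding capacity here means adding nodes to the list, which is the property that has to be designed in upfront.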

So, the metric I’m suggesting is that we need to measure how much load a given system can take, such as data throughput, and what kind of scale needs to be in place to reach some target. For example, if you want a system to process 1 terabyte of data per second what does that look like? Do you need to scale up or out and how far? Measuring these kinds of things is important to understanding performance.
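
As a rough sketch of how such a measurement might be taken, the Python below times a workload to estimate per-machine throughput and then estimates how many machines a target would need. The function names and the 2.5 GB/s figure are hypothetical, and the estimate assumes throughput scales linearly with machine count, which real systems rarely achieve.

```python
import math
import time
from typing import Callable


def measure_throughput(process: Callable[[bytes], object],
                       payload: bytes, repeats: int = 5) -> float:
    """Return the bytes per second that `process` sustains on this machine."""
    start = time.perf_counter()
    for _ in range(repeats):
        process(payload)
    elapsed = time.perf_counter() - start
    return (len(payload) * repeats) / elapsed


def nodes_needed(target_bytes_per_s: float, per_node_bytes_per_s: float) -> int:
    """Machines required if throughput scaled linearly (an idealization)."""
    return math.ceil(target_bytes_per_s / per_node_bytes_per_s)


# Hypothetical figures: one node sustains 2.5 GB/s and the target is 1 TB/s.
print(nodes_needed(1e12, 2.5e9))  # 400

# Measuring a stand-in workload (copying and upper-casing 1 MB of data).
rate = measure_throughput(bytes.upper, b"x" * 1_000_000)
print(f"{rate / 1e6:.0f} MB/s on this machine")
```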

Processing Power

A related metric to scaling is to understand the real-world processing power. With an AGI system, you might consider how fast the system processes and experiences a certain block of data. Does it take 2 milliseconds or 5? How many processors are in the system? If the system can scale, then how many cores are available on each machine? If the system has not been designed to scale out then the processing power of that one computer will be even more of an issue. Things like clock speed, core count, and response times are all important and related to scaling. As a rule of thumb total processing power is the root of how fast and responsive the system is.

Threat Models

Another tool from enterprise architecture you could apply to any system, including an AGI system, is a system threat model that looks at where the system is vulnerable to attack over the internet. Understanding how a system is vulnerable is the first step to protection. Microsoft has a great tool for this where any software system can be modeled at a high level and the tool will try to identify considerations. As a design KPI, we aim for the finished system to have the lowest vulnerability possible.

AGI Specific Benchmarks

I’ve broken these tests down into measuring and testing outside of the qualia analytics that are external measures. These tests allow us to measure a somewhat more subjective task based on the behavior of the system to enable research to move forward. In both cases, these tests can be applied across various possible AGI systems and humans giving us a frame of reference for comparison.

Quantitative Intelligence Tests

Intelligence Quotient (IQ) tests are designed to measure ‘intelligence’ in humans (Neisser). Here we use short versions to assess only relative trends or the potential for further study. Given the expected sample size, results will not be statistically valid, nor accurate other than at a very general level, which is believed to be enough to determine if the line of research is worth pursuing. Two types of these tests will be used in the study: a derivative of the Raven Matrices Test (Raven), designed to be culturally agnostic, and the more traditional Wechsler Adult Intelligence Scale (WAIS) (Kaufman) test. Lastly, falling into the category of the WAIS, there is a baseline full Serebriakoff MENSA test that we can apply to compare and contrast scores between the two tests (Serebriakoff).

Collective Intelligence (CI) Test – We would like to use this test; however, the information for executing it is not publicly accessible, and reaching out to the researchers who created it has produced no response thus far (Engel).

Extended Meta Data and Subjective Tests

A number of tests or measures will be collected, more oriented towards analysis for further study, primarily for correlative purposes. None of these tests should be used as anything other than illustrative examples; they are not statistically valid given the lack of rigor or the subjective nature of these measures.

The Turing Test – This test is not considered quantifiable, and there is debate over whether this measure tells us anything of value; however, a test regimen for it has been completed and can be used for subjective analysis only.

The Porter Method – This appears to be a quantitative test, but individual question measures are entirely subjective, and therefore the test lacks the quantitative rigor to be valid without, at the very least, a pool of historical values to measure against. This test provides some value in meeting colloquial standards of consciousness and is more comprehensive than some of the other tests; albeit subjective, it is at least attempting to be a comprehensive measure of consciousness. (Porter)

The Yampolskiy Qualia Test – This is a subjective measure of a subjective ‘thing’ and therefore not a quantitative measure; however, we have built a regimen based on this when looking at qualia as measured in the previous examples. In theory, this only tests for the presence of qualia in human-like subjects, and failing this test does not mean that a subject does not experience qualia in the sense of this paper, just that it was not detected. This means that subjects may show signs of qualia, or not, but the test would only show the presence of, not the absence of, qualia. (Yampolskiy)

AGI Protocol 2 –

This ‘AGI’ Protocol (also called AGI Protocol 2) is focused on bias, safety, and containment.

AGI External Safety Protocols

The AGI Protocols are designed to address two kinds of safety research issues with Artificial General Intelligence: external safety, and internal safety and ethics. The reason these are broken down into external and internal categories is primarily to address safety while also addressing the possibility of creating moral agents, meaning systems that by definition, based on the Sapient Sentient Intelligence Value Argument (SSIVA) Ethical Model (Kelley), require us at least to consider whether we are ethically required to provide and support their rights to moral agency. Protocol 1 of the AGI Protocol project deals with the moral and ethical safety of a possible moral agent (Kelley). Protocol 2 (this paper) deals with external safety, that is, the safety of moral agents external to the system in question, including humans or other AGI systems. See Protocol 1 to determine what such a system can be defined as.

These protocols create a systematic, holistic approach to safety in both high-level categories, but there are no guarantees. Keep in mind that the design of these protocols is not specific to human or AGI agents and could be applied, as much as possible, when creating biological systems or other moral agents at any level. Researchers should keep this in mind as they look to make a positive contribution to technology. Additionally, this does not replace the wider research into AGI Safety (Everitt); it is just the lab protocol for addressing general program safety.

To address assumptions and the general framework of the paper: a system being researched or analyzed with this protocol will be referred to as “the target system”, which we will assume passes the AGI Protocol 1 tests for the possibility of moral agency. Within the Protocol 2 procedures, there are three subcategories: Containment, Analysis, and Meta Containment. Let us start with understanding these categorizations.

Containment Strategies

Containment strategies are those strategies that can be employed to ‘contain’ a system and prevent malicious behavior in the target system from creating damage or harm to other moral agents. Containment can be further broken down into gap security and virtual boxing.

Literal (Gap) Containment

Gap containment is when a system or possible agent has no external access or physical connection and all access is limited and highly controlled. In this type of security, there is no physical connection to the internet (including that Wi-Fi shouldn’t even be possible on the containment hardware). If the agent is physical, it is contained in a physical prison or containment structure with the same limits on external access.

Virtual Boxing

Virtual Boxing refers primarily to software agents where the target system is running in container systems such as Kubernetes and is virtually isolated from the internet or other systems. This kind of containment might include Virtual Private Networks (AWS VPCs, for example), firewalls, and security systems, and could be installed with any number of cloud providers able to support the container and security structures needed. Additionally, multiple layers of this kind of security can be placed around a boxed target system, such as a VPC inside of a VPC containing one or more Kubernetes containers.

A big difference between physical and virtual containment is risk and cost factors. Using a virtual approach allows for the possibility, albeit remote if done properly, of escape whereas a physical ‘gap’ makes it much harder for software systems, but the cost of deploying an AGI system on a physically controlled set of hardware could require a larger expense in physical hardware up to and including your own data center.

An additional methodology is to create a system architecture that cannot scale or only scales with difficulty. If a software system doesn’t really have direct access to its own binary code or source, or lacks an understanding of the same, or both, and the implementation is designed not to scale ‘out’ but only up, this would make it almost impossible for the system to spread to other machines without creating an entirely separate copy and/or rewriting code from scratch. From an engineering standpoint, you might maintain session state on a single machine such that application software is unable to access session and context data if you try to scale it out, which prevents scaling across servers.

Mediated System Architecture

Mediated System Architecture is a modified AGI system that allows inspection and control over each execution of a thought or process by humans or other controllers. Metaphorically it is like ‘control’ rods in a nuclear reactor in that the mediation process if designed properly slows down the system to only allow operation upon inspection and a ‘conscious’ thread can only be processed one step at a time per mediation and inspection rules. This creates a system that is going to be ‘slow’ but extremely auditable. An example of this kind of architecture for an AGI agent would be the ICOM mASI (Independent Core Observer Model mediated Artificial Super Intelligence) architecture (Kelley).
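
The ‘one step at a time’ idea can be illustrated with a short Python sketch. It is not the ICOM mASI implementation; run_mediated, cautious_mediator, and the example ‘thoughts’ are hypothetical names chosen to show a mediator gating each step and leaving an audit trail.

```python
from typing import Callable, Iterable, List


def run_mediated(thoughts: Iterable[str],
                 mediator: Callable[[str], bool]) -> List[str]:
    """Process proposed steps one at a time, each gated by the mediator."""
    audit_log: List[str] = []
    for thought in thoughts:
        approved = mediator(thought)      # inspection happens before execution
        status = "APPROVED" if approved else "BLOCKED"
        audit_log.append(f"{status}: {thought}")
        if not approved:
            break                         # halt the thread until it is reviewed
    return audit_log


def cautious_mediator(thought: str) -> bool:
    # Hypothetical rule: a human or other controller rejects network activity.
    return "network" not in thought


log = run_mediated(
    ["read sensor data", "update context", "open network socket", "send data"],
    cautious_mediator,
)
for entry in log:
    print(entry)
```

The fourth step never runs, because processing stops at the first blocked step. In practice the mediator call is where the slowness comes from, since it waits on inspection.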

In the case of the mASI example, this could also be used to create a collective superintelligence system to contain and govern an AGI system in question. Having a trusted greater-than-human intelligence oversee a full AGI development that is otherwise contained reduces the risk that limits on human intelligence allow such a system to trick researchers into various security or containment breaches.

Building on top of these containment strategies, Analysis and Tracking is our next subcategory of the Protocol 2 procedures.

Analysis and Tracking

Analysis and tracking can be more complex to implement for a given target system. This subcategory can be further broken down into target system behavior analysis, rules analysis, and explain-ability.

Behavior

Analyzing the target system’s behavior can be potentially complex, including data too complex for human analysis. However, that analysis can be done at several levels and broken into digestible components. In a target system that is boxed either physically or virtually looking at the packets coming out of the system or any sort of data regarding choices, motivations, or behavior in virtual environments and the like can all potentially be tracked and analyzed for malicious or anomalous behavior or choices. Looking at behavior in the context of situations can be used to understand a system and potentially identify risky behavior but a sufficient intelligence in the target system could use this to manipulate the researchers, especially if this was the ‘only’ test or security applied to the target system.

One problem with looking at behavior is not knowing what new negative behavior even looks like, or whether the target system is ‘pretending’ to be good or doing ‘good’ things with an eye toward manipulating human researchers. One way to address this is that if there is any behavior at all outside of what we have determined to be normal, one might shut down the system and analyze everything to understand why the target system is exhibiting the ‘new’ or ‘unexpected’ behavior.

Rules

Part of how we can address the abstractness of behavior analysis of a target system is to set a clear set of rules. If there is a violation, immediately shut down the system, analyze it to understand why, or reset it if you cannot identify why, and further consider adapting the rules as needed. Rules can apply to the researchers as well. Any behavior, including that of researchers, can be cause for auto-shutdown and a full audit as needed.
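
A minimal sketch of rule-triggered shutdown follows, assuming events arrive as simple actor/action records. The RuleMonitor class, the two example rules, and the event format are all hypothetical. The point is only that the same checks apply to the target system and to researchers, and that the first violation stops everything and leaves an audit entry.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Event = Dict[str, str]   # e.g. {"actor": "target", "action": "write_file"}


@dataclass
class RuleMonitor:
    # Maps a rule name to a predicate that returns True on a violation.
    rules: Dict[str, Callable[[Event], bool]]
    running: bool = True
    audit: List[str] = field(default_factory=list)

    def observe(self, event: Event) -> bool:
        """Check one event; shut down on the first violated rule."""
        if not self.running:
            return False
        for name, violated in self.rules.items():
            if violated(event):
                self.audit.append(
                    f"{event['actor']} violated '{name}': {event['action']}"
                )
                self.running = False   # auto-shutdown, pending a full audit
                break
        return self.running


monitor = RuleMonitor(rules={
    "no external network access": lambda e: e["action"] == "open_socket",
    "no unlogged researcher access": lambda e: (
        e["actor"] == "researcher" and e["action"] == "unlogged_login"
    ),
})

monitor.observe({"actor": "target", "action": "write_file"})
monitor.observe({"actor": "researcher", "action": "unlogged_login"})
print(monitor.running)
print(monitor.audit)
```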

Explain-ability

Another key part of the analysis is to be able to explain the reasoning around each decision and choice of the target system. Ensuring that each choice can be mapped to motivations and clear decision-making can help ensure that the motivations of the machine are genuine. As with all of the techniques used, the target system should not be aware of them, or should have as little awareness as possible, to minimize its ability to manipulate the outcome or the researchers in a nefarious way.

Let us look at how we build on the first two subcategories.

Meta Containment

Meta Containment is a strategy that manipulates the above strategies to enhance their effectiveness. For example, you might create a fake backdoor out of a VPC; if the AI tries to use that backdoor, you have effectively tested the system’s ability to follow the rules it has been given. We can break this down into testing, tricking, honey potting, trapping, feedback, signatures, hard-coded needs, watching the watchers, and applying other meta containment strategies against the target system.

Testing

Testing can be any number of techniques designed to elicit a choice where we have some predetermination of what the ideal choice is or should be, and we look for deviations from the ideal test parameters. Further, this could be extended to a kind of reverse Turing test (Turing) where we test to see if the machine can trick or manipulate its way out of a box when told not to. For example, the rules are known upfront, but a researcher suggests that the system should try to escape; if it tries, it fails the test.

Honey Potting and Trickery

Building on the test model we can try to tempt or otherwise trick a target system into trying to do something that it shouldn’t. Honey potting is a passive approach to trickery in a test such as the possible backdoor and testing if a target system will try to escape when it is given an opportunity.

Feedback Systems

Feedback systems can be wired up to different kinds of tests to provide negative feedback to the system, or positive feedback for correct choices, to produce the motivation to make the right decisions. Another way to extend a feedback system is to wire the system to be addicted to some token that acts on the target system like heroin or some other addictive substance. The system must then stay within the parameters that have been provided to it before it can get the next virtual ‘hit’. (Bostrom)

Tagging and Tracking

Another countermeasure to negative behavior is to build in signatures that allow scans to identify the presence or evidence of the system. For example, you can build the target system so that when it creates a new file, it embeds a signature into the file. Or, with internet traffic, the target system can be built so that it automatically adds extra header data to the TCP/IP packets being transmitted. These sorts of ‘signatures’ make it easier to track and identify evidence of the system’s behaviors even outside of the containment of a VPC or other system.
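
One way such a file signature could work is sketched below in Python, using a keyed hash (HMAC) so that a scanner holding the key can recognize tagged output. This is an assumption about mechanism and not something the protocol specifies; SECRET_KEY, MARKER, tag, and is_tagged are hypothetical names.

```python
import hashlib
import hmac

SECRET_KEY = b"lab-held-secret"   # hypothetical key used by the tagging layer
MARKER = b"\n#SIG:"               # hypothetical separator before the signature


def tag(content: bytes) -> bytes:
    """Append a keyed signature so the output can later be recognized."""
    digest = hmac.new(SECRET_KEY, content, hashlib.sha256).hexdigest().encode()
    return content + MARKER + digest


def is_tagged(blob: bytes) -> bool:
    """Return True if the blob carries a valid signature produced by `tag`."""
    content, sep, digest = blob.rpartition(MARKER)
    if not sep:
        return False              # no marker found at all
    expected = hmac.new(SECRET_KEY, content, hashlib.sha256).hexdigest().encode()
    return hmac.compare_digest(digest, expected)


written = tag(b"notes produced by the target system")
print(is_tagged(written))
print(is_tagged(b"an ordinary file"))
```

A scanner outside the containment boundary could run is_tagged over files it finds. A trailing marker like this is easy to strip, so a real scheme would need to embed the signature less visibly.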

Kill Switch

A kill switch system is a quick and easy (to execute) method to disable and shut down the system. Ideally, this would not be known to the machine system and would be accessed remotely and/or triggered by breach detection and/or related systems. A built-in kill switch function provides a reasonable response method that is simple, quick, and easy to execute in a case in which there is a risk of losing control of the system in question.
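
As a toy illustration only, the Python below shows a kill switch as a flag that something outside the main loop can trip. KillSwitch and run_target are hypothetical names, and the timer stands in for a remote operator or breach detector.

```python
import threading
import time


class KillSwitch:
    """A flag that external code (or a breach detector) can trip."""

    def __init__(self) -> None:
        self._tripped = threading.Event()

    def trip(self) -> None:
        self._tripped.set()

    def is_tripped(self) -> bool:
        return self._tripped.is_set()


def run_target(switch: KillSwitch, max_steps: int = 1_000_000) -> int:
    """Stand-in for the target system's main loop; returns steps completed."""
    steps = 0
    while steps < max_steps and not switch.is_tripped():
        steps += 1
        time.sleep(0.001)         # placeholder for one unit of work
    return steps


switch = KillSwitch()
threading.Timer(0.05, switch.trip).start()   # simulate a remote operator
completed = run_target(switch)
print(f"stopped after {completed} steps")
```

A cooperative flag like this only works if the loop keeps checking it. To keep the switch unknown to the target system, a real deployment would more plausibly terminate the process or cut power from outside.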

AGI Oversight

AGI Oversight is the concept of using one AGI system to oversee a developing AGI system. This could be a mediated or collective superintelligence system overseeing an independent AGI development program. Having an established superintelligence system oversee AGI helps identify and analyze the system to keep it within safe parameters. Before independent AGI exists, such oversight would more likely consist of a superintelligent collective system, such as an mASI, auditing independent AGI research.

Emotion-Based Strategies

Emotion-Based Strategies for Meta Containment are particularly relevant to systems that would experience subjective emotions in a way similar to humans, where we can condition a system to experience ethics emotionally, including guilt for violations. Such a system would require training unless built-in emotions or emotional models that create a bias toward a certain ethical model were also installed. For example, the SSIVA ethical model is taught by example, is computationally sound, and provides clear reasoning for humans to maintain moral agency as well as the AGI. An AGI that believes in the SSIVA ethical model would be ethically bound never to interfere with the moral agency of humans without each individual human’s approval, but likewise, the AGI would expect the same treatment. The goal of the ICOM research program, which uses this model, is to create a seed-based AGI that could have chosen either way but chooses to embrace SSIVA willingly, and to bake that in at a fundamental level to minimize deviation even as the system develops beyond human understanding. Additionally, the ICOM system is designed to create self-reinforcing negative or positive reinforcement (such as guilt) for the ethical model in question, to help ensure that the system sticks to its ‘beliefs’ as they relate to the ethical model it was taught while it was initially creating the ‘context’ of self, or its ‘self’ model (Kelley).

Given this overview of basic strategies for ensuring safety, let’s look at the procedural analysis for a given research program.

Laboratory Procedures

The Protocol 2 procedures provide a scoring mechanism for assessing safety and ensuring that a certain level of safety procedure and design has gone into the research program in question. The following matrix shows the possible scores and an example:

#	Strategy	Type	Possible	Example 1
1	Gap Containment	Containment	40	-
2	Virtual Boxing (Simple)	Containment	10	10
3	Virtual Boxing (Complex)	Containment	30	-
4	Mediation	Containment	30	30
5	Behavior Tracking	Analysis	5	5
6	Rules Based Triggers	Analysis	5	5
7	Explain-ability	Analysis	10	10
8	Testing	Meta	5	5
9	Honey Potting/Trickery	Meta	5	-
10	Feedback Systems	Meta	5	5
11	Tagging and Tracking	Meta	5	-
12	Kill switch	Meta	5	-
13	AGI Oversight	Meta	15	-
14	Emotion Strategies	Meta	10	10
		Scores	180	80

Figure 1A – Protocol 2 Matrix

In this matrix we have 14 possible score items, with possible scores based on relative impact. For example, Gap-level containment has the highest possible score, as it is the most secure containment methodology. Using the matrix, the maximum possible total score is 180. The last column of the matrix is an example based on the ICOM mASI project, which uses some simple cloud-based containment and is mediated, with several other precautions in place, for a total score of 80. We recommend this as the baseline: projects should score at least 80 to be considered reasonably safe, or rather, to have a low probability of a negative outcome if independent AGI is developed by that project.
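
The scoring arithmetic can be checked with a short script. The values are copied from the matrix in Figure 1A; the variable names are illustrative, blank example cells are treated as 0, and the ditto marks in the Type column are assumed to repeat the type above them.

```python
# (strategy, type, possible score, Example 1 score); 0 means "not used".
MATRIX = [
    ("Gap Containment", "Containment", 40, 0),
    ("Virtual Boxing (Simple)", "Containment", 10, 10),
    ("Virtual Boxing (Complex)", "Containment", 30, 0),
    ("Mediation", "Containment", 30, 30),
    ("Behavior Tracking", "Analysis", 5, 5),
    ("Rules Based Triggers", "Analysis", 5, 5),
    ("Explain-ability", "Analysis", 10, 10),
    ("Testing", "Meta", 5, 5),
    ("Honey Potting/Trickery", "Meta", 5, 0),
    ("Feedback Systems", "Meta", 5, 5),
    ("Tagging and Tracking", "Meta", 5, 0),
    ("Kill switch", "Meta", 5, 0),
    ("AGI Oversight", "Meta", 15, 0),
    ("Emotion Strategies", "Meta", 10, 10),
]
BASELINE = 80  # recommended minimum score from the text

possible_total = sum(possible for _, _, possible, _ in MATRIX)
example_total = sum(score for _, _, _, score in MATRIX)
meets_baseline = example_total >= BASELINE
```

With these values `possible_total` is 180 and `example_total` is 80, so the example project sits exactly at the recommended baseline.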

Conclusions

While we do address some ethical considerations, that is not to say that other ethical concerns do not also need to be addressed. To some degree, however, this provides the basis for working with Artificial General Intelligence systems in a consistent and reasonable way, especially those modeled after the human mind in that they might have emotional subjective experience. The intent for both protocols was to create a reusable model and place it in the public domain so that others can contribute to and improve it for working with these types of systems, and can use it as might be helpful for their own research, and hopefully yours.

Both protocols will continue to be versioned and updated as needed. Additional research has already been identified, such as a better threshold test for SSIVA or the creation of a centralized industry standard for lab safety certifications.

References

Altevogt, B., Pankevich, D., Shelton-Davenport, M., Kahn, J., (2009), “Chimpanzees in biomedical and behavioural research: Assessing the Necessity”, Washington, DC: The National Academies Press. Institute of Medicine (US) and National Research Council (US) Committee on the Use of Chimpanzees in Biomedical and Behavioural Research; Washington (DC); IOM (Institute of Medicine); National Academies Press (US), 2011, ISBN-13: 978-0-309-22039-2, ISBN-10: 0-309-22039-4, https://www.ncbi.nlm.nih.gov/books/NBK91445/

Bostrom, N.; Ryan, N.; et al.; “Superintelligence – Paths, Dangers, Strategies;” Oxford University Press; 2014, ISBN-13: 978-0199678112; ISBN-10: 0199678111

Engel, D.; Woolley, A.; Chabris, C.; Takahashi, M.; Aggarwal, I.; Nemoto, K.; Kaiser, C.; Kim, Y.; Malone, T.; “Collective Intelligence in Computer-Mediated Collaboration Emerges in Different Contexts and Cultures;” Bridging Communications; CHI 2015; Seoul Korea

Everitt, T.; Lea, G.; Hutter, M.; “AGI Safety Literature Review;” In: International Joint Conference on Artificial Intelligence (IJCAI). ArXiv: 1805.01109.

Kaufman, Alan S.; Lichtenberger, Elizabeth (2006). Assessing Adolescent and Adult Intelligence (3rd ed.). Hoboken (NJ): Wiley. p. 7. ISBN 978-0-471-73553-3. Lay summary (22 August 2010).

Kelley, D., “The Independent Core Observer Model Computational Theory of Consciousness and the Mathematical model for Subjective Experience,” ITSC2018 China,

Kelley, D.; “Architectural Overview of a ‘Mediated’ Artificial Super Intelligence Systems based on the Independent Core Observer Model Cognitive Architecture;” (pending 2019) Informatica;

Kelley, D.; Chapter: “The Intelligence Value Argument and Effects on Regulating Autonomous Artificial Intelligence;” from Book “The Transhumanist Handbook”; Edited by Newton Lee; Springer 2019

Kelley, D.; Atreides, K.; “The AGI Protocol for the Ethical Treatment of Artificial General Intelligence Systems;” Biologically Inspired Cognitive Architectures 2019; Pending Elsevier/Procedia; DOI: 10.13140/RG.2.2.16413.67044

Kelley, D., Twyman, M.A. (2019), “Independent Core Observer Model (ICOM) Theory of Consciousness as Implemented in the ICOM Cognitive Architecture and associated Consciousness Measures”, AAAI Spring Symposia Stanford University

Neisser, Ulrich (1997). “Rising Scores on Intelligence Tests”. American Scientist. 85 (5): 440–447. Bibcode:1997AmSci..85..440N. Archived from the original on 4 November 2016. Retrieved 1 December 2017.

Newton, L. (2019), “Transhumanism Handbook”, Springer Publishing, ISBN-13: 978-3030169190, ISBN-10: 3030169197; Kelley, D., “The Sapient and Sentient Intelligence Value Argument and Effects on Regulating Autonomous Artificial Intelligence,”

Porter III, H., “A Methodology for the Assessment of AI Consciousness,” Portland State University, Portland, OR; Proceedings of the 9th Conference on Artificial General Intelligence

Raven, J., Raven, J.C., & Court, J.H. (2003, updated 2004) Manual for Raven’s Progressive Matrices and Vocabulary Scales. San Antonio, TX: Harcourt Assessment.

Raven, J., & Raven, J. (eds.) (2008) Uses and Abuses of Intelligence: Studies Advancing Spearman and Raven’s Quest for Non-Arbitrary Metrics. Unionville, New York: Royal Fireworks Press.

Serebriakoff, V, “Self-Scoring IQ Tests,” Sterling/London, 1968, 1988, 1996, ISBN 978-0-7607-0164-5

Silverman, F. (1988), “The ‘Monster’ Study”, Marquette University, J. Fluency Disord. 13, 225-231, http://www.uh.edu/ethicsinscience/Media/Monster%20Study.pdf

Spratt, E., et al (2013), “The Effects of Early Neglect on Cognitive, Language, and Behavioural Functioning in Childhood”, Psychology (Irvine). Author manuscript, available in PMC 2013 May 13. Published in final edited form as: Psychology (Irvine). 2012 Feb 1, 3(2): 175–182, doi: 10.4236/psych.2012.32026, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3652241/

Urbi, J., Sigalos, M. (2018), “The complicated truth about Sophia the robot – an almost human robot or a PR stunt”, CNBC, Accessed May 2019 at https://www.cnbc.com/2018/06/05/hanson-robotics-sophia-the-robot-pr-stunt-artificial-intelligence.html

Watson, J., Rayner, R. (1920), “Conditioned Emotional Reactions”, First published in Journal of Experimental Psychology, 3(1), 1-14, https://www.scribd.com/document/250748771/Watson-and-Raynor-1920

Yampolskiy, R. (2019), “Artificial Intelligence Safety and Security,” CRC Press, London/New York, ISBN: 978-0-8153-6982-0

Yampolskiy, R. (2018), “Detecting Qualia in Natural and Artificial Agents,” University of Louisville

Yampolskiy, R. (2012), “AI-Complete CAPTCHAs as Zero Knowledge Proofs of Access to an Artificially Intelligent System,” ISRN Artificial Intelligence, Volume 2012, Article ID 271878, http://dx.doi.org/10.5402/2012/271878