ABSTRACT

Title of Thesis: DEVELOPMENT OF LOW-COST AUTONOMOUS RESEARCH SYSTEMS
Logan M. Saar, Master of Science, 2023
Thesis directed by: Professor Ichiro Takeuchi, Materials Science and Engineering

A central challenge of materials discovery for improved technologies arises from the increasing compositional, processing, and structural complexity involved when synthesizing hitherto unexplored material systems. Traditional Edisonian and combinatorial high-throughput methods have not been able to keep up with the exponential growth in potential materials and relevant property metrics. Autonomously operated Self-Driving Labs (SDLs) - guided by the optimal experiment design sub-field of machine learning, known as active learning - have arisen as promising candidates for intelligently searching these high-dimensional search spaces. In the fields of biology, pharmacology, and chemistry, these SDLs have allowed for expedited experimental discovery of new drugs, catalysts, and more. However, in materials science, highly specialized workflows and bespoke robotics have limited the impact of SDLs and contributed to their exorbitant costs. In order to equip the next generation workforce of scientists and advanced manufacturers with the skills needed to coexist with, improve, and understand the benefits and limitations of these autonomous systems, a low-cost and modular SDL must be available to them. This thesis describes the development of such a system and its implementation in an undergraduate and graduate machine learning for materials science course. The low-cost SDL system developed is shown to be affordable for primary through graduate level adoption, and provides a hands-on method for simultaneously teaching active learning, robotics, measurement science, programming, and teamwork: all necessary skills for an autonomous-compatible workforce. A novel hypothesis generation and validation active learning scheme is also demonstrated in the discovery of simple composition/acidity relationships.

DEVELOPMENT OF LOW-COST AUTONOMOUS RESEARCH SYSTEMS
by
Logan M. Saar

Thesis submitted to the Faculty of the Graduate School of the University of Maryland, College Park, in partial fulfillment of the requirements for the degree of Master of Science
2023

Advisory Committee:
Professor Ichiro Takeuchi, Chair
Professor Ji-Cheng Zhao
Dr. Aaron Gilad Kusne

© Copyright by Logan M. Saar 2023

Acknowledgements

I am extremely lucky to have had so many encouraging and helpful people in my life over the past two and a half years, all of whom deserve appreciation for their support.

To my research advisor, Dr. Ichiro Takeuchi, I am truly thankful for the opportunity I have had to receive your guidance and assistance throughout my research. It has been a pleasure to be mentored by you and I am grateful for the chance you gave me to implement my research in the classroom as an educational tool. To my other advisor, Dr. Gilad Kusne, I am equally thankful for your guidance and patience when helping me to overcome the hurdles in this project. To both of you, your constructive communication, patience, and friendliness have been a stellar example to me of not only how to conduct research, but also how to treat others well.

Thank you to my entire research group and other students I have had the pleasure to work with during my time at university: you have made me feel welcome and helped me grow as an individual by sharing your knowledge and offering your support. Thank you to Haotong for the countless hours of helping me troubleshoot code and help students; to Felix for the valuable assistance in facilitating the use of the robots in the classroom; to Alex, whose introduction to the project and previous work set me up well for success; and to the many others who have helped me in small ways - I hope to return the favor with my own support.
I would like to give a big thanks to my family and my friends, who have always given me unconditional support, love, and loyalty. You have all carried me through this busy time in my life and always made it easier with your attention and presence.

Thank you to my defense committee members for being so generous with their time. To Dr. JC Zhao, your support of this project and my education has been a true blessing, and I am extremely grateful for your confidence in me. To Adaire Parker, Dr. Isabel Lloyd, and the entire MSE department at UMD: thank you for presenting me with all the opportunities to give back to the program and for your assistance in helping me navigate my coursework. Finally, a big thanks to the National Institute of Standards and Technology for their financial support during the Summer Undergraduate Research Fellowship in 2021.

Table of Contents

Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1: Introduction
    1.1 Perspective & Motivation
    1.2 Thesis Overview
    1.3 Background
        1.3.1 Active Learning Schema
        1.3.2 Relevant ML Concepts
        1.3.3 Autonomous Experimentation in Materials Science
        1.3.4 Educating an SDL Compatible Workforce
    1.4 Thesis Outline: Ch.2 - Ch.4 (Development and Operation)
Chapter 2: Systems Development
    2.1 Design Principles
    2.2 Mechanical Design
    2.3 System Construction & Preparation
    2.4 Connecting to & Calibrating LEGOLAS
    2.5 Developmental Stages
Chapter 3: Educational Implementation
    3.1 Henderson-Hasselbalch Exercise
        3.1.1 Introductory Exercises
        3.1.2 Autonomous Closed Loop Exercise
    3.2 Class Implementations
        3.2.1 Fall 2021
        3.2.2 Fall 2022
        3.2.3 Lessons Learned
Chapter 4: Autonomous Model Exploration
    4.1 Bayesian Optimization
    4.2 Hypothesis Validation Objective
        4.2.1 Generating and Evaluating Candidate Hypotheses
        4.2.2 Informational Entropy Acquisition Function
Chapter 5: Future Work and Conclusions
    5.1 Future Work & Scope
        5.2.1 Alternate Educational Exercises
        5.2.2 System Modifications
    5.2 Conclusions
Appendices
    I: Associated Media
Bibliography

List of Tables

Table 1.1 Summary of some recent low-cost SDL platforms
Table 2.1 Cost Breakdown for a single LEGOLAS robot
Table 2.2 Summary of FDM printed parts

List of Figures

Figure 1.1 Picture of LEGOLAS & its use in the Fall ’21 Machine Learning for Materials Science course at UMD
Figure 1.2 Pictorial representation of Bayes' Theorem for iteratively updating our beliefs
Figure 1.3 Three target foundational pillars for educational development identified by an MGI working group
Figure 1.4 Images of low-cost SDLs from Table 1.1
Figure 1.5 Simplified pictorial representation for an SDL such as LEGOLAS
Figure 2.1 Experimental components on the trolley
Figure 2.2 Camera attached to the bottom of the trolley
Figure 2.3 All 3D printed parts needed to build one LEGOLAS robot
Figure 2.4 Fully assembled stand
Figure 2.5 Fully assembled bridge
Figure 2.6 Assembled trolley resting on the bridge
Figure 2.7 Sample space setup for the Henderson-Hasselbalch pH study
Figure 2.8 Location of relevant wires and electronic devices for LEGOLAS
Figure 2.9 GUI calibration window and liquid-volume/gear-step calibration using a mg digital scale
Figure 2.10 Example usage of fundamental movement functions and a synthesis/measurement loop
Figure 2.11 Liquid handling robot that inspired the LEGOLAS design
Figure 2.12 LEGOLAS (Generation I)
Figure 2.13 LEGOLAS (Generation II)
Figure 2.14 Classroom setup for Fall ’21 implementation
Figure 2.15 LEGOLAS (Generation III)
Figure 2.16 LEGOLAS (Generation IV)
Figure 2.17 Students working on LEGOLAS exercises during the Fall ’22 implementation
Figure 3.1 The Henderson-Hasselbalch equation
Figure 3.2 Percent error in the Henderson-Hasselbalch equation as a function of sodium hydroxide concentration
Figure 3.3 Gaussian process demonstrated for 5 samples within a composition range of %acid = [5-95]
Figure 3.4 Gaussian process evolution (RBF kernel) for first 5 experimental samples with an exploration-based CO
Figure 4.1 Sampling of potential models across the prior distribution determined by the kernel
Figure 4.2 Probabilistic interpretation of the GP for uncertainty quantification and propagation
Figure 4.3 Effect of new data on GP structure and a purely exploration-based acquisition function
Figure 4.4 Process-flow for experimentation, hypothesis generation, and evaluation
Figure 4.5 Process-flow for creating GMMs over sample space
Figure 4.6 Using the informational entropy metric to select the next composition
Figure 5.1 Camera images of sample wells from the color mixing study
Figure 5.2 Analog aqueous conductivity probe, spectrophotometer, and heating element

Chapter 1: Introduction

1.1 Perspective & Motivation

In the past 50 years, automated systems - systems that perform tasks with little or no human control - revolutionized the manner in which society operated [1]. These new technologies increased productivity and efficiency by automating repetitive and precision-based tasks: tasks for which humans were more poorly suited [2]. These automated systems, created by and for humans, greatly improved society’s ability to produce on a large scale [1,2]. Inherent to this revolution was a shift in the needs of the workplace. On one hand, there was a new demand for people to develop and improve the automated systems themselves [3]. On the other hand, there was an increase in demand for workers - humans - who could tend to, interact with, and understand both the hardware and software elements of the automated systems [3]. At all levels of education, classes that taught the fundamentals of automated systems - from computer programming to robotics - began receiving a greater degree of emphasis [1,3]. These were all efforts to equip the next workforce with the skills necessary to coexist with automated systems and thrive in an automated world.

For materials science, this automated revolution expanded the ability to explore and experiment with a larger space of compositions, processing parameters, and performance metrics [4]. In large part, the substantial successes of combinatorial and high-throughput experimentation (CHT) were made possible by automated systems that accelerated the synthesis and characterization steps [4]. However, in automated systems, the “lead scientists” - the experiment-directing, field-specific human experts - are never supplanted by the robotic task-performing machines [4].
Incapable of straying beyond the means of their typical tasks - even if preprogrammed to act differently in certain situations - the automated systems could not take on the intuition-based activity of understanding the world through the scientific method: observation, hypothesis, and experiment [5,6]. At most, they played a role in this process, as a tool which could facilitate and accelerate experiments.

In the past few decades, with the advent of machine learning (ML) and the broader field of artificial intelligence (AI), it has become possible to merge these new technologies with automated robotics into what are known as autonomous systems. Capable of approximating human intelligence - mainly the faculty of learning and adapting - these systems are endowed with adaptive algorithms for responding to varying stimuli in a non-preprogrammed, non-predetermined fashion. To what degree AI & ML constitute intelligence with respect to our human definition of the word is already a subject of much debate, but what is clear - from their widespread adoption (albeit still mostly in the development stage) and rising popularity - is that they are going to play a large role in the world of the future [1,7-9]. Self-driving cars, autonomous factories, autonomous drones, the internet-of-things: all are examples of a data-driven world in which machines have begun to usurp certain previously human-performed decision-making tasks [10,11].

The pervading question, however, is to what degree these autonomous systems could function in roles where the workflow is not only highly complex and dynamic, but also not clearly defined or understood by humans themselves [5,12-14]. How may an autonomous system perform within a profession in which the role can be ambiguous, dependent on insight, and itself a method of discovery, such as that of a “lead scientist”? [5,7,8,12,15,16]
To consider this question, as well as to better illuminate what a “lead scientist” really does, is one of the essential steps in preparing the ground for autonomous experimentation (AE) as a fruitful field for cultivation [5,15,17]. A second, related question that arises is: what role will humans continue to play in science, and thereby what skills & fundamentals will be useful for them to cultivate to continue thriving in this changing world? [18-21]

This thesis attempts to approach and begin to answer both of these questions. In the background section, the fundamentals of ML, AI, and active learning will be discussed in terms of their implementation in AE. Next, the merit of AE will be discussed with respect to prior successes and challenges within the field of Materials Science and Engineering (MSE). We present a LEGO-based low-cost autonomous scientist as an embodiment of an autonomous science kit to be used for educating the next generation workforce. The requirements for an autonomous-compatible workforce of human researchers will be discussed, focusing on the skills needed and methods of teaching those skills. Additionally, the role of a “lead scientist” and the ability to potentially encapsulate the scientific process of hypothesis generation and testing within an AE system’s “intelligence” will be considered.

1.2 Thesis Overview

This thesis describes my development of low-cost robotic systems for closed-loop AE. These systems are designed to be affordable, modular, and simple to operate to ensure usability at all educational levels. They are composed of inexpensive components, including LEGOs, to allow for the affordability and modularity desired. From here on out, they will be referred to as LEGOLAS: LEGO-based Low-cost Autonomous Scientist(s) (Figure 1.1).

I began development of LEGOLAS during the Summer Undergraduate Research Fellowship (SURF) program at the National Institute of Standards and Technology (NIST) in 2020.
The first task was building the robot and producing a live run with a purely exploration-based campaign objective (Section 4.1). Some figures shown in this section are from the resulting colloquium presentation provided at the end of that summer [22]. I was then able to further develop the Autonomous Model Exploration modules (Section 4.2) while working as an undergraduate research assistant in the ML group at UMD (co-led by Dr. Ichiro Takeuchi at UMD and Dr. Gilad Kusne at NIST). Many of the figures in Sections 4.1 & 4.2 are from a presentation I gave in December of 2021 at the Materials Research Society (MRS) Symposium on Accelerating Experimental Materials Research with Machine Learning [23]. We were then also able to submit a publication on this work to MRS Bulletin in May of 2022 (published November 2022) [24]. Further publications on the use of LEGOLAS in the classroom for solid-state materials research, and multi-objective studies, are expected in the coming months and years.

Chapter 2 describes the design principles and methods I used in determining how to construct, operate, and interface with LEGOLAS. Chapter 3 then reviews the use of LEGOLAS in two undergraduate/graduate courses for Machine Learning for Materials Science (ENMA 437/637) offered at the University of Maryland (UMD) (Figure 1.1). In these classes, I was the lead teaching fellow and designed the exercises for completion, which served as the final project for students. Chapter 4 discusses the use of LEGOLAS to perform novel on-the-fly hypothesis generation and validation using symbolic regression and an acquisition function rooted in the use of informational entropy as a metric of uncertainty [25,26]. The process of active learning development displayed in Chapter 4 is meant to be a testament to the ability of LEGOLAS to facilitate a creative and responsive learning environment in which the encoding of patterns of human “intelligence” into AE systems is more easily achieved.
Chapter 5 discusses the future of LEGOLAS and the educational opportunities it provides.

Figure 1.1: (left) Picture of LEGOLAS & (right) its use in the Fall ’21 Machine Learning for Materials Science course at UMD

1.3 Background

1.3.1 Active Learning Schema

Active learning is itself a sub-field of ML, and its application in AE systems is what distinguishes them from the rigid, prescriptive behaviors of purely automated systems. AE systems are flexible and capable of guiding and altering experimental design - that is, the choice of experimental prerequisites (composition, processing parameters) and/or characterization schemes (means of collecting feedback/data) - to achieve some user-defined “goal” [27]. These goals, known as campaign objectives (COs), may be divided into roughly 3 main categories: exploration, exploitation, & mechanistic study [27]. These COs will be described in more detail later, but briefly:

- an exploration CO aims to “explore” a large amount of sample space, prioritizing the investigation of areas in which we have little information (i.e., the unknown),
- an exploitation CO - also known as an optimization CO - usually aims to maximize or minimize some response metric (feedback or performance measure) and thereby investigate areas in which we believe the optimal metric could be found (i.e., use the known to make a prediction),
- and a mechanistic study CO aims to either determine or attribute an underlying functional form to the data - a relation between input parameters and output values - with the highest possible confidence, always considering that this form is selected amongst a sea of other potential descriptive forms.

In reality, most active learning schemas are a combination of these COs, and will implement them together or in succession depending on the nature of the experiment.
As an example, in a hypothetical materials “discovery” experiment, it may be beneficial to first explore the composition space, conducting initial experiments in unknown areas of sample space, and then - once some of the sample space is known - shift to an exploitation CO in which subsequent experiments search at or near established maxima/minima. In this way, the active learning schema provides an “intelligent” means of searching through the sample space, potentially avoiding the headache of investigating all samples (i.e., exploring the entire time), and additionally avoiding the pitfall of trying to optimize the property when nothing is known (i.e., “shooting in the dark,” or exploiting from the start). This idea of alternating between COs in an active learning schema is known as “scheduling.”

It is important to note that in all these COs, we are typically basing our decision-making criteria on some idea of the assumed value of running a certain experiment. The various methods for quantifying this value and using it for experimental guidance are at the heart of active learning, and will be elucidated in more detail when acquisition functions are discussed.

When active learning is implemented in AE towards the achievement of these COs, it is often in pursuit of a reduced experimental cost.
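The scheduling idea described above (explore first, then exploit) can be sketched in a few lines. This is only an illustrative toy, not the schema used on LEGOLAS: the names `explore`, `exploit`, `run_campaign`, and `toy_response` are hypothetical, and the distance to the nearest measured point is used as a crude stand-in for model uncertainty.

```python
def explore(candidates, measured):
    """Exploration step: pick the untested candidate farthest from all measured points."""
    untested = [c for c in candidates if c not in measured]
    return max(untested, key=lambda c: min(abs(c - m) for m in measured))

def exploit(candidates, measured):
    """Exploitation step: pick the untested candidate nearest the best measurement so far."""
    best_x = max(measured, key=measured.get)
    untested = [c for c in candidates if c not in measured]
    return min(untested, key=lambda c: abs(c - best_x))

def run_campaign(experiment, candidates, start, n_explore, n_total):
    """Schedule: n_explore exploration steps after the start point, then exploitation."""
    measured = {start: experiment(start)}
    while len(measured) < n_total:
        step = explore if len(measured) <= n_explore else exploit
        x = step(candidates, measured)
        measured[x] = experiment(x)
    return measured

def toy_response(x):
    """Hypothetical property to maximize, peaking at x = 62 (an assumption for the demo)."""
    return -(x - 62) ** 2

candidates = list(range(0, 101, 5))  # e.g. composition in steps of 5%
results = run_campaign(toy_response, candidates, start=50, n_explore=4, n_total=9)
best = max(results, key=results.get)
print(sorted(results), "best:", best)
```

With only 9 of the 21 candidate compositions measured, the schedule spends its first steps spreading out over the space and its last steps closing in on the best region, which is the trade-off the text describes.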
This “reduced cost” can take on several meanings depending on the constraints and nature of the specific experiment:

- (1) acceleration of the process due to a lack of time or of the funds required to carry out experiments,
- (2) reduction in the number of experiments due to a lack of funds, time, or resources,
- (3) reduction in the number of experiments due to a prohibitively large sample space.

Reasons (1) & (2) are quite straightforward when considering costly and time-restricted studies - for example, those involving synchrotron beam time or expensive constituents [28] - and reason (1) is to some degree satisfied by automated experimentation systems (for example, those of combinatorial experimentation studies) since they can reduce sample transfer time, human error, and other slowdowns associated with human-executed processes [4,12,29-31].

Reason (3) arises from the exponential growth in the number of “combinations” of possible experimental conditions with the addition of each processing/composition variable [7,27]. For example, the exhaustive study of 10 compositions at 10 temperatures would only necessitate 100 experiments. Upon additionally investigating 10 pertinent pressure values and the inclusion of a new constituent at 10 concentrations, there are now 10^4 possible experiments. This is often prohibitive for the implementation of an exhaustive or “grid” search of the sample space, and although it is quite obvious that an Edisonian approach would be inapt, it is also true that a purely automated grid search may not be appropriate given the time/money costs [27].
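The arithmetic behind this growth is simply a product over the number of levels of each variable. As a small sketch (the helper name `grid_size` is hypothetical), the two cases from the example above can be checked directly:

```python
import math
from itertools import product

def grid_size(levels_per_variable):
    """Number of experiments in an exhaustive grid search: the product of all level counts."""
    return math.prod(levels_per_variable)

two_variables = grid_size([10, 10])           # composition x temperature
four_variables = grid_size([10, 10, 10, 10])  # + pressure + a new constituent

# Explicit enumeration of the 4-variable grid agrees with the product formula.
enumerated = sum(1 for _ in product(range(10), repeat=4))

print(two_variables, four_variables, enumerated)
```

Each additional variable with 10 levels multiplies the experiment count by 10, so the cost of a grid search grows exponentially with the number of variables.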
This “curse of dimensionality” problem is highly prevalent in materials science, especially in the fields of materials discovery and property optimization, since there are a large number of potential constituents (considering binary, ternary, and quaternary compounds consisting of elements selected from across the periodic table), a large number of processing parameters (temperature, pressure, etc.), a large number of structural considerations (film thickness, growth technique, crystal structure, etc.), and a large number of relevant property metrics (optical, electrical, mechanical, etc.) that affect any target performance criterion [14,27]. When considering the complex web of process/structure/property/performance relationships, is there a way to “intelligently experiment” to reduce the number of experiments necessary to achieve a CO, and how should one do this?

This is not a new question, and the Design of Experiments (DoE) field has existed to address the large sample space problem for many decades [32]. This field is concerned with optimal experiment design under resource or time constraints, and aims to identify causal variables. It may explore relationships between independent and dependent variables in a multifactorial fashion, alter experimental design based on collected results (sequential analysis), and use randomization - among other techniques - to achieve this [33].

In AE, active learning takes the place of DoE [32]. Specifically, the quantification of value for a potential experiment - as mentioned earlier - is done through what is known as an acquisition function. In the closed-loop AE process flow, an acquisition function is generated and updated after each successive data point is measured, and signifies the value metric that each new potential experiment “should” have (see Section 1.3.2 for ML background). The next experimental conditions can be chosen on the basis of the optimal value metric [27].
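To make the closed-loop role of an acquisition function concrete, the following minimal sketch uses the predictive variance of a Gaussian process with an RBF kernel as a purely exploration-based acquisition function and recomputes it after every new point. It is an illustration under stated assumptions, not the LEGOLAS implementation: the function names, the length scale of 15, the noise term, and the 5-95 %acid candidate grid are all chosen for the example, and a small pure-Python linear solver stands in for a numerical library.

```python
import math

def rbf(a, b, length_scale=15.0):
    """Squared-exponential (RBF) kernel: covariance between two compositions."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def posterior_variance(x_new, x_seen, noise=1e-6):
    """GP predictive variance at x_new given the already-measured locations x_seen."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(x_seen)] for i, a in enumerate(x_seen)]
    k_star = [rbf(x_new, a) for a in x_seen]
    weights = solve(K, k_star)
    return rbf(x_new, x_new) - sum(k * w for k, w in zip(k_star, weights))

def next_experiment(candidates, x_seen):
    """Exploration acquisition: choose the candidate with the largest predictive variance."""
    return max(candidates, key=lambda x: posterior_variance(x, x_seen))

candidates = list(range(5, 100, 5))  # %acid from 5 to 95 (assumed grid)
x_seen = [50]                        # one initial measurement
for _ in range(4):                   # acquisition is recomputed after every new point
    x_seen.append(next_experiment(candidates, x_seen))
print(x_seen)
```

For a GP with fixed kernel hyperparameters the predictive variance depends only on where measurements were taken, not on the measured values, which is why no measurement function appears in this sketch; an exploitation-oriented acquisition function would also use the predicted mean.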
An acquisition function is meant to adapt and change after each successive data point is collected, reflecting a new “mindset” from which the robot works as it incorporates new data. This is roughly akin to a human taking interest in guiding the experiment in a certain or different way once they have conducted some runs/trials - something that purely automated systems are incapable of achieving. The degree, however, to which this characteristic truly embodies the ineffable process of human intuition in guiding experiment is a subject of much debate [5]. Nonetheless, the acquisition function is the next logical step towards approximating the process of our adaptable intelligence and intuition.

Since, inherently, the acquisition function is a supposition of experimental value, it is of great importance that acquisition functions be modular, flexible, and, to whatever degree possible, informed [16,34]. Without these characteristics, it is easy for the acquisition function to be ill-suited for providing a real benefit towards intelligently searching the space, and in fact certain comparative studies have shown little to no benefit at all from relatively uninformed active learning schema in guiding certain experiments [9]. To inform an acquisition function means providing it with prior data, physics axioms and principles (as in physics-informed AE), or even - in the case of human-in-the-loop AE - some human supervision to ensure that it doesn’t become too unreasonable or counterproductive in its experimental suggestions [34,35]. Reasonableness, however, is quite hard to quantify, and to a large degree the predominant intention for implementing active learning/AE guidance in certain fields is that humans themselves may be ill-suited toward - and sometimes biased against - seeing the data-driven “signs” that point towards valuable next experiments [27,36].
In summary, active learning, like any tool, has its pros and cons, and it should be the goal of any teaching program to develop a critical thinking mindset for distinguishing between them so that it is properly applied in a useful manner (see Section 2.1 for Design Principles).

1.3.2 Relevant ML Concepts

The ability to autonomously guide experimental design, as discussed in Section 1.3.1, relies on a quantification of supposed value as a function of experimental prerequisites, which in turn emerges from the underlying ML framework utilized. In this thesis, the predominant ML frameworks used are those of Bayesian statistics and Gaussian processes. In our application of ML to active learning for AE, the task will often be one of regression: minimization of the discrepancy between our model's predictions (i.e., our beliefs) and the observed experimental data (i.e., reality). Our predictive models in the case of parameterized models may rely on the selection and tuning of a functional form, or in other cases (such as with Gaussian processes) lack the need for an explicit functional form but instead require the selection of an appropriate kernel, which is essentially a covariance function [37].

The utility of Bayesian statistics with regard to active learning arises from its ability to quantify model uncertainty [37,38]. As discussed in Section 1.3.1, some COs (exploration, exploitation) explicitly require some encoding of the concept of what is known or unknown (i.e., where we are confident in our beliefs, and where we are not). These concepts can be conflated with the quantification of uncertainty in our model [37,38]. The Bayesian statistical approach is based on the work of Thomas Bayes, and is exemplified by Bayes' theorem (Equation 1.1) [38].

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$ (Eq. 1.1) [38]

This equation is intended to assist in producing the posterior probability [$P(A \mid B)$], a conditional probability that represents the probability of event A given event B is true [38]. We rely on inputs of a prior probability [$P(A)$], which represents the initial beliefs about the probability of event A, and a marginal probability [$P(B)$], which may be expanded into constituent terms to allow for calculation [38]. The Bayesian framework is used often in the analysis of positivity rates in medical testing, but to see its application to active learning, it may be rewritten (Equation 1.2) [38]:

$$P(\text{model} \mid \text{data}) = \frac{P(\text{data} \mid \text{model})\, P(\text{model})}{P(\text{data})}$$ (Eq. 1.2) [38]

In this case we can presuppose a predictive model, and our posterior represents the probability of our predictive model being true given our observed data. If we recognize that our predictive model is a continuous explanatory function across our sample space, we can see that we actually calculate a posterior distribution across this same space [37,38]. This may be a corollary to a confidence distribution for our model (i.e., where we are more and less certain about its predictive capabilities and whether or not it is “true”). In addition to inputting a predictive model, we must also input a prior (which represents our initial beliefs/confidence for our model) [38]. Our other conditional probability [$P(\text{data} \mid \text{model})$] (which represents the likelihood of observing this data, assuming our model is correct) is the main metric that is updated as we gain new data. In this way, Bayes' theorem can be used iteratively between closed-loop runs (synthesis & measurement) to constantly update our prior and use the calculated posterior to construct our acquisition function (based on quantification of uncertainty) (Figure 1.2).
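The iterative use of Equation 1.2 can be illustrated with a small sketch in which the posterior after one measurement becomes the prior for the next. Everything specific here is an assumption made for the example: the two candidate models, the Gaussian measurement noise with `sigma = 0.5`, the made-up observations, and the helper names `gaussian_likelihood` and `bayes_update`.

```python
import math

def gaussian_likelihood(y_obs, y_pred, sigma=0.5):
    """P(data | model): density of one observation under Gaussian noise around the prediction."""
    z = (y_obs - y_pred) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def bayes_update(prior, likelihoods):
    """Eq. 1.2 over a discrete set of models: posterior = likelihood * prior / evidence."""
    evidence = sum(l * p for l, p in zip(likelihoods, prior))  # P(data)
    return [l * p / evidence for l, p in zip(likelihoods, prior)]

# Two hypothetical candidate models of a response y(x).
models = {
    "linear": lambda x: 2.0 * x,
    "quadratic": lambda x: x ** 2,
}
names = list(models)
belief = [0.5, 0.5]  # prior: no initial preference between the models

# Made-up (x, y) measurements, processed one closed-loop run at a time.
observations = [(1.0, 1.1), (3.0, 8.7), (4.0, 16.2)]
for x, y in observations:
    likelihoods = [gaussian_likelihood(y, models[name](x)) for name in names]
    belief = bayes_update(belief, likelihoods)  # posterior becomes the next prior
    print(dict(zip(names, (round(b, 4) for b in belief))))
```

After each observation the belief shifts toward the model whose predictions better match the data, and the beliefs always sum to one because the evidence term normalizes them.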
Figure 1.2: Pictorial representation of Bayes' theorem for iteratively updating our beliefs

Two of the active learning techniques implemented on LEGOLAS to study the simple chemical system described here (Chapter 3) involve Bayesian ML to some degree: Bayesian optimization using Gaussian processes (Section 4.1) and Bayesian inference for parameter refinement in preselected models (not discussed in this thesis). The third technique - Hypothesis Generation and Validation - uses an information-entropy-based approach to quantify experimental value (i.e., the acquisition function) (Section 4.2). To produce potential hypotheses, several candidates are either preselected by the user or generated via a symbolic regression package (genetic programming) (Section 4.2.1). Fitting of these candidate hypotheses occurs either through simple non-linear least squares regression, or automatically within the symbolic regression functions. Hypothesis validation is accomplished by evaluating one of two possible metrics, both of which reward goodness of fit and penalize complexity: the Bayesian Information Criterion (BIC) and a mean squared error (MSE)/complexity score (Section 4.2.1). The fundamental statistical workings of symbolic regression are not described in this thesis, but more can be learned from sources in the bibliography [25,39]. Only the Bayesian optimization active learning schema is currently taught in the Machine Learning for Materials Science course (ENMA 437/637) at the University of Maryland, where LEGOLAS is used as a teaching tool, although other techniques may be introduced in future implementations (Chapter 3).

1.3.3 Autonomous Experimentation in Materials Science

Many thorough reviews regarding the successes, experimental extents, and scopes of AE in theoretical, computational, and experimental materials science have been published in the past decade [7-9,12-14,16,18,27,29].
The use of AE for exploration, optimization, and discovery of materials systems has already been demonstrated in many areas of materials science: reducing the number of neutron diffraction experiments needed to determine the transition temperatures of magnetic materials [28], incorporating physics intuition and structural phase mapping to accelerate the discovery of best-in-class phase change memory materials [34], autonomous study of processing parameters to optimize carbon nanotube growth rate [40,41], autonomous optimization of the mechanical properties of 3D printed structures [42], accelerated discovery of new metallic glasses [43], and active learning for tuning interatomic potentials in atomistic simulations [44], among many others [29,45-49].

At the heart of it, the greatest misconception surrounding AE is that it could lead to the elimination of humans from the scientific process altogether [12,14,27]. In all of the studies mentioned, however, a great deal of human guidance, monitoring, and encoding of human intuition into ML frameworks was required to create meaningful results [29,34,40-49]. The hopes of those developing AE systems are that they could (1) free up time for experts by relieving them of laborious and repetitive laboratory procedures, (2) equip them with data-driven methods of navigating highly dimensional sample spaces, perhaps beyond the scope of human intuition & theory-based analysis, and (3) forgo analytical delays between experiments to allow accelerated exploration with automated equipment. The field of AE itself has its beginnings in Biology and Chemistry, where workflows generally involve more liquid handling and are less complex [17,50-52]. In the field of Materials Science, workflows may require high amounts of dexterity, and can be extremely complex and highly dissimilar from case to case [13,29].
Insights from thermodynamic phase stability, structural considerations, and defect equilibria (among other confounding factors) have proven necessary for making some sense of the inherently high-dimensional and highly complex search spaces (process/structure/property/performance relationships) [9,34,35]. The call to utilize autonomous exploration techniques is therefore warranted, yet it is further complicated by expensive and complex characterization techniques [12-14]. Due to these restrictions, the use of Self-Driving Labs (SDLs) - controlling synthesis, measurement, and analysis functionalities - has been limited to universities and groups with large funding sources, resulting in highly specialized and extremely expensive setups that are not easily accessible to the majority of research groups, and completely out of reach for most educational instruction purposes [49,53,54]. A few commercial modular SDL platforms are on the market, but again many of them cost hundreds of thousands of dollars, leading to a lack of availability for educational institutions [53,54].

In order to fully embrace SDL technologies, expertise across many domains is needed, including computer science, decision-making theory, experimental science, machine learning, statistics, materials science, systems engineering, robotics, software design, etc. [19,55]. An ideal research group working in SDL technologies might consist of experts in various fields, but it is quite unreasonable to expect any one person to possess the utmost expertise in all necessary domains. An SDL compatible workforce should at least be conversant in these fields and understand their basics [19,55] (Section 1.3.4). For developing a manufacturing workforce in industry to manage autonomously operating factories, the same is true [19,20].
Additionally, since collaboration is required to achieve the large-scale goals of autonomous manufacturing or experimentation, teamwork should be emphasized in developing the workforce [19]. One of the main barriers to the expansion of AE as a field, however, is the lack of low-cost, accessible teaching platforms and curricula for the education of an SDL compatible workforce [12,13,18,19,27,55] (Section 1.3.4).

1.3.4 Educating an SDL Compatible Workforce

The Materials Genome Initiative (MGI) is a large-scale multi-agency federal initiative launched in 2011, with the goal of “deploying advanced materials twice as fast and at a fraction of the cost compared to traditional methods” [55]. The MGI was developed in response to the 10-20 year delay from materials discovery to commercial implementation that handicaps our ability to respond to existential societal threats such as climate change and sustainable energy needs [7,55]. In 2021, the MGI's strategic plan consisted of three main goals, one of which was to “Educate, Train, and Connect the Materials Research & Development Workforce” [19,55]. From its conception, MGI set out to ensure our workforce was “trained for careers in academia or industry, including high-tech manufacturing jobs” [55]. The first objective to tackle in achieving this goal was to “Address Current Challenges in R&D Education” at the undergraduate, graduate, and foundational (K-12) levels. The second objective was to “Train the Next-Generation Workforce,” which includes mid and late career post-graduates, manufacturing workers, and other industry employees [20,55]. In 2019, an MGI working group identified 3 foundational pillars and their associated necessary competencies: data management, computation, and experimentation (Figure 1.3) [19,55].
Figure 1.3: Three target foundational pillars for educational development identified by an MGI working group [19]

The MGI working group emphasized that “students need not be experts in all three areas but should be conversant in multiple topics across this spectrum” [55]. They also pointed out that the contrasting vernacular and cultures of theorists, experimentalists, computational scientists, and other contributing experts necessitated a convergent educational preparation in order to provide the optimal cross-pollination between groups. Teamwork - a critical aspect of large scale success - was emphasized as well [55].

Existing educational solutions for the development of MGI-related skills have touched upon perhaps 1-2 of the foundational pillars, but rarely all three at once. For instance, at the graduate level, the Data-Enabled Discovery and Design of Energy Materials (D³EM) program at Texas A&M University provides a well-defined track for the convergence of data informatics, computational methods, teamwork, leadership, and materials science fundamentals [56]. However, it currently lacks emphasis on automation and autonomous technologies: a crucial aspect of experimental acceleration techniques [56]. Although other established programs at the undergraduate and graduate level - as well as informational bootcamps and summer programs - have also begun to incorporate ML and active learning techniques into their curricula, they currently lack a tangible platform with which to teach these techniques experimentally [56-61]. For the Machine Learning for Materials Science graduate/undergraduate course (ENMA 437/637) taught at the University of Maryland, this has been a challenge since the inception of the course in 2019.
The majority of the course modules focus on ML and applied ML techniques, and at the end, when active learning is discussed, it was observed that many students lacked the ability to fully incorporate active learning as a decision-making engine (either in the class assignments or in their own applied research). In other developing fields such as drone technology and self-driving cars, where there is also a large emphasis on autonomous development, educators have succeeded by implementing tangible platforms in the classroom, such as self-driving RC cars with multitudes of sensors and autonomously flying drones [62-66]. It was observed that the use of real platforms helped identify and remedy deficiencies in understanding of the robot point-of-view (POV) (i.e., what the robot knows vs. what the human knows), robotics hardware proficiency, coding and data management skills, and teamwork capabilities [66]. In developing these skills, students became more adept at distinguishing between realistic and useful applications of autonomous control, and those that were “impractically ambitious” [66]. In this way, the students came to fundamentally deconvolute the seemingly nebulous concepts of AI, ML, and autonomous control to facilitate practical applications [66]. Although the use of simulated experiments or previously collected data can be helpful in teaching the data-driven side of autonomous systems, it can also lead to a lack of proficiency and understanding on the experimental side. It is a goal of MGI to have students be at least conversant in all of these fields [19,55].

Some of the tangible and hands-on platforms for autonomous drone technology cost only a few hundred dollars and were composed of widely available DIY robotics kits (Raspberry Pi, Arduino, etc.) [64].
However, in the case of autonomous driving, the F1Tenth program noted that their price per system (nearly $5,000) was cost prohibitive (considering a class typically needed 10 systems) for university programs, and certainly at the foundational (K-12) levels [62]. Therefore, for maximum educational impact, it should be a goal of any tangible platform to be low-cost and affordable (Section 2.1: Design Principles - Affordability). At the same time, one of the main benefits of a hands-on educational tool is ultimately the captivation of student interest. It should also be a goal of the system to entice students and motivate them with exciting and fun challenges (Section 2.1: Design Principles - Usability & Modularity) [13]. In the case of the F1Tenth car, this was achieved by replacing exams with racing events [62]. Lastly, one key aspect of all tangible platforms is that they attempt to accurately portray “real-world” autonomous situations (Section 2.1: Design Principles - Transferability). For hands-on educational platforms, there must be a careful balance between system complexity and usability, so that the majority of students are neither overwhelmed nor bored during its use [66].

Although SDLs are being developed at research institutions for fruitful materials science studies (Section 1.3), these bespoke systems typically cost hundreds of thousands of dollars, so they are neither available nor abundant enough for widespread educational use [13,49,53,54]. Several groups in the past decade have seen an opportunity to produce tangible SDL platforms for teaching at a wide range of costs ($100 - $10,000) (Table 1.1, Figure 1.4) [67-73].
Typically, these are cartesian style robots that use syringe pumps or peristaltic pumps to manipulate liquids, and perform characterization through contact measurements or via computer vision techniques (composition-color relationships, composition-acidity relationships, and even autonomously refining cocktails based on human feedback) [67,69-71,73-79]. There are also examples of autonomous 3D print optimization schemes, networking robots, and some attempts at solid-state experiments (using powders or wax) [68]. Table 1.1 shows the strengths and drawbacks of each of these systems. Some SDL platforms are high quality, modular, usable, and transferable, but simply too expensive for educational use [67,72,76,77,79]. On the other hand, some are extremely cheap, but lack modularity, appeal, and most importantly transferability when assessed against the MGI emphasized competencies [19,55,68-70].

Table 1.1: Summary of some recent low-cost SDL platforms.

Figure 1.4: Images of low-cost SDLs from Table 1.1

In addition to the SDLs shown here, there has also been a rise in so-called “cloud labs,” which allow students or collaborators to interact with experimental hardware from a distance, in some cases using AI to facilitate complex manufacturing processes [54,80,81]. Although these technologies have the potential to revolutionize certain repeatable synthesis routes and to mitigate the “crisis of reproducibility” increasingly apparent in recent human-produced studies, they are likely not good tools for educational purposes because they distance the student from the tangible aspects of experimentation [12,29,30,31]. In addition, it is clear from reviews and analyses of many SDL based setups and experiments that it is highly unlikely that humans will be completely removed from the physical laboratory [9,12,27,30]. In nearly every study it can be seen that a human is eventually needed - to replace experimental consumables, to troubleshoot hardware issues, etc.
- and at best the “closed-loop” can only operate for so long before it needs tuning [13,15,17,29,46,51]. Additionally, thorough knowledge of experimental hardware helps reduce the “impractically ambitious” expectations often inherent to pure theorists [62]. Therefore, it should be a definite goal of educational platforms to continue to instill the importance of working with your hands and with tools on the equipment (Section 2.1: Transferability).

Of the best SDL platforms (Table 1.1), the Chemorobotic Robot - essentially a liquid-handling cartesian gantry - was identified as the optimal option due to its incorporation of 3D printed modular parts, a captivating application, smaller profile, and robust design [79]. However, due to the incorporation of syringe pumps and bespoke parts, this SDL had an elevated cost - although cheaper syringe pump alternatives now exist [82] - and required a complicated construction process [79]. For the work presented in this thesis, a LEGO based chemistry robot design for STEM education (Figure 1.1) was developed to provide much of the same functionality at a reduced cost (~$950; for a full breakdown of parts and costs see Section 2.3). This was done by 3D printing many of the parts, as well as replacing the syringe pumps with a simple plastic syringe. Computer vision and contact measurements were made possible by the incorporation of a low-cost ($30), reliable pH sensor and a small USB camera [83,84].

LEGOLAS is a cartesian style gantry robot that can be expanded beyond the tasks described in this thesis due to its high degree of modularity (Chapter 5). The design principles followed for creating LEGOLAS are described in detail in Section 2.1, but briefly, the incorporation of LEGO parts was intended to elevate student interest in the project, remove the stigma of complexity from the physical system (so non-experts could learn as well), and allow for low-cost modularity (i.e.
creative students could shape the course of the robotics design quickly and cheaply). By simplifying the experimental setup and reducing its costs, LEGOLAS facilitates financially achievable adoption at K-12, undergraduate, graduate, and industrial education levels, and effectively prepares students for the following competencies:

MGI Foundational Pillar 1 (Data)
- Data Handling: Students manage incoming measurements and manipulate data structures to facilitate ML analysis. (LEGOLAS Challenges 1 - 3 in Chapter 3)
- Software and codes to manage MG workflows: Students embrace workflows in the easy-to-learn yet complexity-tolerant Python language, where they manage both robotic control functions and active learning analysis in live-run closed-loop AE fashion. (LEGOLAS Challenge 4 in Chapter 3)

MGI Foundational Pillar 2 (Computation)
- Microstructure evolution and material response: Students study the relationship between composition and acidity in the main challenges described here. In a future exercise being developed for LEGOLAS (Chapter 5), students may be able to use computer vision to study the solidification kinetics and resulting growth structures of precipitating salt crystals in solution.

MGI Foundational Pillar 3 (Experiments)
- Multi-objective design and decision-making under uncertainty: Students learn to quantify uncertainty using Bayesian ML techniques, and guide multi-objective optimization studies involving color mixing, solidification kinetics, and acidity.
- Measurement methods and tools: Students work hands-on with an electrochemical probe for measuring pH, calibrate the sensor, and build an understanding of the noise inherent to it (Chapter 3). In addition, they must calibrate the synthesis syringe using a mg-sensitive scale and calibrate the sample well locations.
- Sensor fusion, high-throughput methods, and automation: This is LEGOLAS's greatest strength, combining synthesis, measurement, and active learning into an autonomous closed-loop AE cycle (Section 3.1.2) (Figure 1.5).

Figure 1.5: Simplified pictorial representation of an SDL such as LEGOLAS

1.4 Thesis Outline: Ch.2 - Ch.4 (Development and Operation)

An instructional overview of the construction and operation of LEGOLAS is given in Sections 2.2 & 2.3, with links to more detailed instructions therein. Worksheets and challenge templates for educational implementation are summarized in Chapter 3. Code used to implement the active learning techniques for autonomous model discovery is outlined in Chapter 4. All detailed instruction files, worksheets, challenge templates, and code for any acquisition functions described here can be found centrally at the LEGOLAS GitHub page. Additional software modules for relevant chemistry experiments, challenge worksheets for students, active learning code, and machine learning code can also be requested via email from: takeuchi@umd.edu, agiladk@gmail.com.

Chapter 2: Systems Development

2.1 Design Principles

LEGOLAS is designed with several overarching principles in mind: affordability, usability, modularity, and transferability. These principles are held paramount for the purpose of achieving maximum educational impact. A large educational impact can best be realized through a system that is widely accessible and that realistically portrays the benefits, drawbacks, simplicities, and challenges of closed-loop AE. It should also be a goal of the educational tool to promote critical thinking - with respect to active learning and experimental design - and not merely a prescriptive carrying out of established steps and rules (Chapter 3).
Most closed-loop AE systems are currently cost-prohibitive, leading-edge research tools that are available only to a small, select group of scientists and researchers who are far along in their careers [13,30]. A low-cost, easily reproducible AE system encourages the instruction of AE fundamentals at earlier ages, facilitating implementation in high-school, undergraduate, and graduate education. It is likely that an earlier introduction to the AE field, and thereby a wider reach, can only serve to enliven and further enrich its goals of improving our ability to produce societally beneficial materials [7,13,30].

Usability of an educational AE system refers not only to its ability to operate reliably and smoothly, but also to its capacity to captivate and inspire student interest [13,62,66]. LEGO components were selected to elevate this level of interest, as well as to reduce the stigma of complexity that can overwhelm some students. These LEGO components, however, are by no means of poor quality, and provide the robot with structure, modularity, and a pleasant aesthetic quality. All aspects of the design embody the principles of interactive usability and simplicity: manual controls on motors, open and accessible joints, removable trolley & bridge, light-weight & sturdy aluminum frame, Arduino & Raspberry Pi integration, remote Wi-Fi connectivity, and Python based Jupyter Notebook control.

The modularity of an educational AE system is important to ensure it can be extended beyond its original mode of use into new and fresh domains for exploration by creative students [13,30].
Physically, the LEGOLAS system described here is quite modular: 3D printed (and thereby replaceable) components, semi-permanent joining methods (LEGO pins, machine screws), LEGO parts & motors that can be rearranged for different types of experiments, and an open frame and trolley design providing space for additional sensors, probes, cameras, and other experimental hardware. These attributes allow LEGOLAS to be expanded into new experiments beyond the ones described here (Chapter 5). With respect to the software side of an AE education system, ideally there should be a control interface that is relatively easy to learn, robust, and tolerant of complexity [30]. For LEGOLAS, a Graphical User Interface (GUI) has been developed to reduce experimental calibration complexity. Coding in the easy-to-learn & widely supported Python language (Jupyter Notebooks) allows both beginner and experienced programmers to get what they need out of the platform. Within Python, community supported active learning/ML and robot control functions can coexist, allowing students to visualize the process-flow of closed-loop AE quite well.

Finally, the aspect of transferability - albeit hard to quantify - is key to setting up an educational AE system for maximum impact. Transferability refers to the educational AE system’s ability to accurately approximate the challenges, drawbacks, and benefits of a larger-scale research grade AE system in a way that will prepare students for the “real thing.” The key fundamentals of research level closed-loop AE will be elaborated upon later (Chapter 3), but briefly, they are:
- (1) the skills of working with your hands & with tools on hardware,
- (2) the knowledge to interface with motors, sensors, & robotic components,
- (3) the patience and attention to detail to properly calibrate and tend to the machine,
- (4) the knowledge of ML techniques for prediction, clustering, etc.,
- (5) the skills of programming and data manipulation,
- (6) the insight to encapsulate research “objectives” into active learning schema,
- (7) the foresight and knowledge to simulate live-runs,
- (8) the skills of presenting data in a digestible & meaningful fashion,
- (9) and the discernment to understand pros & cons of AE.

These are skills that are often distributed amongst a team of specialists in larger-scale research-grade AE groups, but all are necessary to ensure a successful and meaningful use of AE for research purposes [19,66]. By possessing many or all of these skills, an individual has a greater ability to integrate their research goals and experimental processes to achieve relevant applications. However, nearly all high-level research occurs in group settings, so it should still be a goal of an educational AE system to promote teamwork [19]. In LEGOLAS, the principle of transferability depends largely on the way the system is used and taught, rather than on any physical characteristics of the machine itself. This will be discussed further in the educational section (Chapter 3). However, certain aspects of the LEGOLAS system - as with any machine - are liable to occasionally malfunction. These challenging moments provided some of the greatest learning benefits for the students, since, in reality, no AE system is truly a “closed” loop (humans will need to periodically intervene) [13,30,46,51].

2.2 Mechanical Design

LEGOLAS is a cartesian gantry style robot with a frame track (y-axis), a bridge track (x-axis), and a trolley cart that rests on the bridge. Each assembly contains an assortment of LEGO components, aluminum frame extrusions, electronics, and/or 3D printed parts. The trolley, bridge, and frame may be constructed from their constituent components with common tools and adhesives (Section 2.3) and are easily removable for maintenance or alterations.
The X & Y position of the trolley cart can be controlled via keyboard or manually through control knobs (Figures 2.1, 2.5). Force sensors aligned along the X & Y axes allow for automatic recalibration of relative experimental coordinates (i.e., the location of the sample wells or reservoirs) in the case of the bridge or trolley being derailed or dislodged. The trolley cart contains all experimental components (pH sensor, syringe & plunger, and/or camera), which are free to move in the Z-direction (Figures 2.1, 2.2). LEGOLAS was inspired by a fully LEGO based liquid-handling robot developed for chemistry experiments [85] (Figure 2.11). I have developed 4 generations of LEGOLAS since 2019, attempting to give each new generation a lower cost, higher stability, and greater interactivity. The chronological development and the modifications made for each of these models are shown in Section 2.5. To better visualize the operation of LEGOLAS (Generation III), one can see it performing experiments in this video [86].

Figure 2.1: Experimental components on the trolley (Generation IV LEGOLAS). Axes of motion (double sided arrows) and manual control knob locations (rotational arrows) shown.

Figure 2.2: USB camera attached to the bottom of the trolley (Generation IV LEGOLAS). The camera can easily be attached and removed for different types of experiments/studies.

2.3 System Construction & Preparation

This section contains the general process flow of sourcing components, 3D printing parts, constructing LEGOLAS, and preparing the experimental consumables needed for the simple chemistry experiment described in Chapter 3. Links to in-depth instruction modules are provided within each section.
Sourcing Components and Cost Breakdown: The LEGO components used in the LEGOLAS design were originally sourced from the LEGO® MINDSTORMS® EV3 Core (45544) and Robot Inventor (51515) sets, but can be sourced more easily from individual part retailers and via the LEGO Education website (for motors and sensors). Aluminum frame components can be acquired from MakerBeam B.V., and additional non-LEGO electronics and chemistry equipment from Amazon and other online retailers. See the detailed links for specifics on part sourcing. A cost breakdown is shown in Table 2.1. Note that this does not include the cost of chemicals for experimentation or of a USB compatible camera; these vary with the desired application but typically bring the full cost to around $1,000.

Detailed Links: LEGO Parts List, Sourcing & Cost of LEGO Parts, Sourcing & Cost of Non-LEGO Parts

Table 2.1: Cost Breakdown for a single LEGOLAS robot

| Category | Included | Cost |
|---|---|---|
| LEGO Parts | Structural Components | ~ $60.00 |
| LEGO Electronics | Motors, Force Sensors | ~ $280.00 |
| Frame Components | MakerBeam Structural Parts | ~ $100.00 |
| Non-LEGO Electronics | Raspberry Pis, BuildHat, Arduino, pH Sensor, Chargers, Wiring | ~ $470.00 |
| Chemistry Equipment | Wells, Stand, Syringe, Dispenser Tips | ~ $40.00 |
| Total | | $950 |

Printing Parts: Parts were printed with Ø1.75 mm Galaxy Silver Prusament Polylactic Acid (PLA) using a Fused Deposition Modeling (FDM) style Original Prusa i3 MK3S printer (Figure 2.3). With the exception of one Raspberry Pi Holder Case [87], all parts were developed in the Autodesk Fusion360 CAD software. Total PLA needed for all prints is ~410 g, for around $12.50 in material costs. Table 2.2 summarizes some relevant characteristics for each part. Information on optimal orientation, advanced printer settings, post-processing tips, as well as the CAD (.f3d) and printable (.stl, .3mf, .obj) files for each part are in the detailed links below.
Detailed Links: 3D Printing Guide, Printable Files

Table 2.2: Summary of FDM printed parts

| Part | Purpose | Supports? | Quantity | Material Used (g) | Cost* |
|---|---|---|---|---|---|
| pH Sensor Guide Tube | Guide pH sensor towards sample wells | Yes | 1 | 42.45 | $1.28 |
| pH Sensor Sleeve | Protect pH Sensor | No | 1 | 11.29 | $0.34 |
| Trolley Base | Hold experimental hardware | No | 1 | 38.71 | $1.17 |
| Mid-Axle Supports | Support mid-spans of X-axis axles | No | 6 | 14.77 | $0.45 |
| End-Axle Supports | Support ends of X-axis axles | No | 4 | 12.21 | $0.37 |
| Side Assembly | Hold R-Pi & Force Sensors on Bridge | Yes | 1 | 66.42 | $2.01 |
| Reservoir Tank and Stands | Contain liquid constituents | Yes | 1 | 203.41 | $6.15 |
| Force Sensor Holders | Fasten force sensors to frame | No | 2 | 2.18 | $0.07 |
| R-Pi Case | Hold Raspberry Pi | No | 1 | 16.51 | $0.50 |

*Based on $30/kg PLA costs

Figure 2.3: All 3D printed parts needed to build one LEGOLAS robot

Constructing Stand: The gantry stand is composed of MakerBeam (10 mm x 10 mm profile) miniature slotted aluminum extrusion pieces, 2 force sensor holders (3D printed), MakerBeam fasteners, a LEGO force sensor, and LEGO gear rack pieces (composing the y-axis of the gantry) (Figure 2.4). Store-bought superglue and a 2 mm hex key are all that is needed to construct it. Instructions are included in the links below.

Detailed Links: Stand Instructions

Figure 2.4: Fully assembled stand (320 x 340 x 60 mm profile)

Constructing Bridge: Similar to the stand, the bridge is constructed of an aluminum MakerBeam frame with associated fasteners (Figure 2.5). It also includes a LEGO force sensor, 3D printed components (4 End-Axle Supports, 6 Mid-Axle Supports, 1 Side Assembly), assorted LEGO pieces, a Raspberry Pi, a BuildHAT, and LEGO rack pieces (composing the x-axis of the gantry). Like the stand, it only requires the hex key and superglue to be assembled.

Detailed Links: Bridge Instructions

Figure 2.5: (left) Assembled bridge, with X-axis (double sided arrows) and Y-axis manual control gear location (rotational arrow) shown. (right) 3D printed Side Assembly with Raspberry Pi + BuildHAT attached in their holder piece.

Constructing Trolley: The trolley is built of assorted LEGO pieces, 4 LEGO motors, a Raspberry Pi, a BuildHAT, and experimental equipment (plastic syringe, Arduino pH sensor, USB camera) that all rest on the 3D printed Trolley Base frame (Figure 2.6). One must first build the Syringe Plunger Assembly, then the Syringe Holder Assembly, and finally follow the Trolley Instructions to complete construction. The only additional items needed to build it are scissors, superglue, 18-24” of braided fishing line, and 3 small barrel swivels.

Detailed Links: Trolley Instructions, Syringe Plunger Assembly Instructions, Syringe Holder Assembly Instructions

Figure 2.6: (left) Assembled trolley resting on the bridge, with Raspberry Pi + BuildHAT assembly and X-axis manual control gear location (rotational arrow) shown. (right) Syringe Plunger & Holder assembly, prior to being inserted into the Trolley Frame.

Sample Space: The sample space shown in Figure 2.7 is applicable to the Henderson-Hasselbalch pH study (Chapter 3), and can be modified in terms of components and their orientation for other types of experiments. This setup contains the constituent reservoir tanks (3D printed and coated with an epoxy to ensure solution impermeability for acid, base, and deionized (DI) water solutions), 48 sample wells (each Ø1.5 cm, 3.5 mL volume), and a mg-accurate digital scale for volume calibration (Section 2.4: System Calibration). Each reservoir tank holds 50 mL of either acid or base solution, facilitating the experimental use of all 48 sample wells if needed (assuming 2 mL sample volume). These components are held in place by two adjustable MakerBeam aluminum extrusion pieces, and can easily be removed from the sample space by sliding them out through the open end for post-experiment clean-up.
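The reservoir budget quoted above can be checked with a short calculation. The sketch below is illustrative only and assumes each 2 mL sample is a binary mix of acid and base drawn from the two 50 mL reservoirs; the function name and the evenly spaced composition plan are hypothetical, not part of the LEGOLAS code.

```python
def reservoir_usage_ml(acid_fractions, sample_volume_ml=2.0):
    """Total acid and base volume (mL) drawn for a list of acid volume fractions."""
    acid = sum(f * sample_volume_ml for f in acid_fractions)
    base = sum((1.0 - f) * sample_volume_ml for f in acid_fractions)
    return acid, base

RESERVOIR_ML = 50.0  # capacity of each reservoir tank (from the text)
N_WELLS = 48         # number of sample wells (from the text)

# Hypothetical plan: acid fraction evenly spaced from 0 to 1 across all wells.
plan = [i / (N_WELLS - 1) for i in range(N_WELLS)]
acid_ml, base_ml = reservoir_usage_ml(plan)
fits = acid_ml <= RESERVOIR_ML and base_ml <= RESERVOIR_ML
print(acid_ml, base_ml, fits)
```

Under this assumption a balanced plan draws about 48 mL from each tank, which fits. A plan skewed heavily toward one constituent (for example, 48 pure-acid samples, 96 mL) would require refilling that tank.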
The sample wells and reservoir are elevated several cm off of the ground, providing ample space for the insertion of hot plates and other process-altering equipment for modular experiment design, if desired (Chapter 5).

Detailed Links: Chemistry Guide (step #2)

Figure 2.7: Sample space setup for the Henderson-Hasselbalch pH study

Chemistry: Preparing the solutions for the Henderson-Hasselbalch pH study involves non-toxic constituents, and produces chemicals with acidity on the level of vinegar and milk of magnesia, allowing safe classroom use even at primary school levels. Only a mg-accurate scale, a 10 mL graduated cylinder, small glassware components, and common chemicals are required (see Chemistry Guide for more details).

Detailed Links: Chemistry Guide

2.4 Connecting to & Calibrating LEGOLAS

This section describes the means of properly wiring LEGOLAS, downloading the appropriate configuration files, connecting to the robot via WiFi, and calibrating both the system and the pH sensor (only needed if running acidity based studies as in Chapter 3).

Wiring: LEGOLAS has 2 Raspberry Pi & BuildHAT stacks, with one (R-Pi A) located on the Side Assembly of the bridge (controls Y-axis motor, Y-axis force sensor, X-axis force sensor), and one (R-Pi B) located on the Trolley (controls Syringe Z-motor, Syringe Plunger motor, pH Sensor Z-motor, X-axis motor). Two 120 V wall outlets must be available for the BuildHAT chargers, which are the exclusive power source for LEGOLAS and for the Arduino; the Arduino connects to the pH sensor and to R-Pi A. A chemistry lab retort stand is used to suspend the pH sensor’s BNC cable and keep it out of LEGOLAS’s path of motion (Figure 2.8).
Detailed Links: Wiring Guide

Figure 2.8: Location of relevant wires and electronic devices for LEGOLAS

Raspberry Pis & WiFi Connection: Each Raspberry Pi + BuildHAT stack must have the Raspberry Pi operating system installed, as well as packages for automatic remote SSH connection (RPyC Server) and BuildHAT interfacing. A WiFi connection (from your computer to LEGOLAS) may be enabled through a local router to facilitate DHCP Client List observations and Address Reservations. The Raspberry Pi configuration can also be copied from one microSD card to another using a USB microSD card reader, which expedites setup of the second Raspberry Pi + BuildHAT stack (see Raspberry Pi & BuildHAT Setup Guide for more details on proper setup).

Detailed Links: Raspberry Pi & BuildHAT Setup Guide

File Set-up: The preferred LEGOLAS programming environment uses Jupyter Notebooks through the Anaconda platform. Files for calibration (manual.py, config.py), movement functions (core.py), demonstration (LegolasDemo.ipynb), and classroom use (LegolasOutline.ipynb) can be downloaded from the LEGOLAS GitHub page (see LEGOLAS Scripts). Instructions for directory management and use are in the beginning of the LEGOLAS Calibration & Use Guide.

Detailed Links: LEGOLAS Scripts, LEGOLAS Calibration & Use Guide (pages 1-7)

System Calibration: Prior to using LEGOLAS for experiments, one must calibrate the positions of the relevant experimental locations (reservoirs, sample wells, DI water tank, syringe/pH sensor device offset) in Cartesian space relative to a user-set origin point (which is defined by an XY offset from the X-axis and Y-axis force sensors). If the LEGOLAS trolley or bridge is then derailed, it may reestablish these locations relative to that origin point. Once XY positions are calibrated, the user must define the range of motion (in the Z-direction) for the pH sensor and syringe, as well as their distance of offset in the XY plane.
Finally, the liquid-volume/gear-step ratio is defined via calibration with a mg-accurate scale. One may use keyboard controls or the manual control gears to perform calibration steps. All aforementioned steps use a GUI that reduces calibration time to ~10 minutes (Figure 2.9). The calibration values are exported from the GUI into config.yaml (to be called in the Jupyter Notebooks). Additionally, the pH sensor should be calibrated (see Arduino pH Calibration Code) prior to experiment, which can be carried out with installation of the Arduino IDE and a 2-point buffer solution calibration. To extend pH sensor bulb lifetime, it is advised to avoid long periods of exposure (> 1 hr) outside the pH storage solution (1 M KCl).

Detailed Links: LEGOLAS Calibration & Use Guide, Arduino pH Calibration Code

Figure 2.9: (left) GUI calibration window, and (right) liquid-volume/gear-step calibration using the mg-accurate digital scale.

Coding Interface: Once the system is physically calibrated, one can connect to LEGOLAS and verify its calibration using the LegolasDemo.ipynb file in Jupyter Notebooks (Figure 2.10). Students may also become comfortable with the motor movement functions here, before moving on to LegolasOutline.ipynb (which contains the worksheet problems in Markdown text) for completion of the assigned challenges. Installation and usage of common ML packages (such as GPy and scikit-learn) in this Python environment allows students to visualize the process flow of AE in one central location.

Detailed Links: LegolasDemo.ipynb, LegolasOutline.ipynb

Figure 2.10: (top) Example usage of fundamental movement functions (contained in core.py) and (bottom) example of a simple loop for experimental synthesis and measurement in a 4x6 grid of sample wells (no active learning involved). These snippets are included in the LegolasDemo.ipynb file.
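The liquid-volume/gear-step calibration described in this section amounts to fitting a line between plunger motor steps and dispensed volume. The following is a minimal sketch of that idea, not the code in the LEGOLAS scripts: it assumes a solution density of about 1 g/mL so that the scale reading in grams approximates volume in mL, and the function names and the sample readings are hypothetical.

```python
import numpy as np


def fit_volume_per_step(steps, mass_g, density_g_per_ml=1.0):
    """Return mL dispensed per motor step (least-squares line through the origin)."""
    steps = np.asarray(steps, dtype=float)
    volume_ml = np.asarray(mass_g, dtype=float) / density_g_per_ml
    # Closed-form slope for a fit constrained to pass through (0, 0).
    return float(steps @ volume_ml / (steps @ steps))


def steps_for_volume(target_ml, ml_per_step):
    """Convert a requested volume into a whole number of motor steps."""
    return int(round(target_ml / ml_per_step))


# Made-up readings for illustration only (not measured LEGOLAS data).
steps = [100, 200, 300, 400]
mass_g = [0.50, 1.00, 1.50, 2.00]

ml_per_step = fit_volume_per_step(steps, mass_g)
print("mL per step:", ml_per_step)
print("steps for a 2 mL sample:", steps_for_volume(2.0, ml_per_step))
```

Forcing the fit through the origin reflects the assumption that zero steps dispense zero liquid; if the plunger has slack, a fit with an intercept may be more appropriate.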
2.5 Developmental Stages

This section visually depicts the 4 generations of LEGOLAS, in which mechanical design, interface methods, and reliability were iteratively improved based on student feedback and for the purpose of reducing material costs (Figures 2.11-2.17).

Figure 2.11: Liquid handling robot that inspired the LEGOLAS design [85]. This robot was designed using LEGO® MINDSTORMS® EV3 robotic and frame components, and was capable of liquid dispensing and color detection (using LEGO color sensors).

Figure 2.12: LEGOLAS (Generation I) using a MINDSTORMS® EV3 Brick. This model was accessible for control via WiFi or Bluetooth, and now included a sturdier aluminum frame, as well as a pH sensor for acidity measurements.

Figure 2.13: LEGOLAS (Generation II) using the LEGO® Robot Inventor Hub. This model was accessible for control via Bluetooth only, and was created to utilize LEGO’s newest robotic kit as they phased out production of the MINDSTORMS® EV3 sets. The Inventor Hub acted only as a microcontroller, providing limited functionality compared with the EV3 kit.

Figure 2.14: Classroom setup for the Fall '21 implementation, which used Generation I & II LEGOLAS.

Figure 2.15: LEGOLAS (Generation III) using integrated Raspberry Pi & BuildHATs to interface with LEGO motors. This model was accessible for control through SSH over WiFi. This design featured new 3D printed axle supports and greater mechanical stability. It was utilized to conduct the first live runs of the on-the-fly hypothesis validation active learning schema. The Raspberry Pi & BuildHAT interface allowed the same programming freedom available with the prior MINDSTORMS® EV3 kits.

Figure 2.16: (top) LEGOLAS (Generation IV) used in the Fall '22 UMD course and (bottom) an image of the robot with a black frame (MakerBeam extrusions).
This LEGOLAS had more 3D printed components (trolley base, pH sensor lowering assembly, and side cart) that reduced the overall cost of the robot while improving its stability and robustness. Generation IV LEGOLAS could also be equipped with a camera for computer vision based AE (Chapter 5).

Figure 2.17: Students working on the LEGOLAS exercises during the Fall '22 implementation, which utilized Generation IV LEGOLAS.

Chapter 3: Educational Implementation

3.1 Henderson-Hasselbalch Exercise

The Henderson-Hasselbalch exercise was selected as a simple example for teaching AE fundamentals, and served as the final project in the Machine Learning for Materials Science course at the University of Maryland in the Fall '21 and Fall '22 semesters. Students had spent the initial 12 weeks of the class overviewing ML techniques (regression, classification, supervised learning, unsupervised learning, etc.), and had been introduced to active learning only about 1 week prior to the start of the project. The selected exercise was intentionally simple - it contains only one control variable (mixture ratio of acid and base solutions) and one response variable (acidity or pH) - so that fundamental concepts could be discussed in conjunction with hands-on implementation. It is fundamentally a chemistry problem, but the ideas of composition-property relationships extend naturally to materials science problems. Although the physical system is simple, layers of complexity can be added with the use of novel active learning techniques (Gaussian Processes, Bayesian Inference, and Entropy Based Acquisition functions) without a related increase in system price. Additionally, the chemical constituents are non-toxic and safe to work with, minimizing the risk of injury to students.
The Henderson-Hasselbalch equation is a simplified and rearranged mass-action equation that relates the pH of a buffer solution to the relative concentration of an acid and its conjugate base (Equation 3.1, Figure 3.1).

$$\mathrm{pH} = \mathrm{p}K_a + \log_{10}\left(\frac{[\mathrm{A}^-]}{[\mathrm{HA}]}\right) \qquad \text{(Eq. 3.1) [88]}$$

Here $[\mathrm{A}^-]$ is the concentration of the conjugate base, $[\mathrm{HA}]$ is the concentration of the acid, and $\mathrm{p}K_a$ is the logarithmic acid dissociation constant.

Figure 3.1: The Henderson-Hasselbalch equation for the system of study as expressed in a titration curve. The trend here does not appear exactly logarithmic due to the scale of the x-axis (not in units of [base]/[acid]) [89]

The equation relies on several assumptions, namely that the acid is monobasic, self-ionization of water may be ignored, the salt base completely dissociates in solution, and the activity coefficient quotient remains constant within experimental conditions [88]. These assumptions are reliably met with buffer solutions of sodium acetate salt (NaOAc) and acetic acid (CH$_3$COOH) each prepared at 1 M concentrations (acetic acid pK$_a$ = 4.756), although the Henderson-Hasselbalch equation begins to deviate from experimental results at high acid compositions due to the violation of certain assumptions in the simplification (Figure 3.2) [88]. This deviation from expected behavior is later used as a teaching example of the benefits of non-parametric models such as Gaussian Processes in materials exploration.

Figure 3.2: Percent error in the Henderson-Hasselbalch equation as a function of sodium hydroxide volume per 100 mL of weak acid (see pK$_a$ = 5 as a sufficient analogue to our system with pK$_a$ = 4.756). The red region indicates solutions of higher acid concentration in which the pH does not correspond well with the Henderson-Hasselbalch equation [88]

The Henderson-Hasselbalch equation represents the functional form of interest in the mechanistic study of the composition-property relationship.
In using this example, the key idea to be transferred to students is that the exploration of the acidity as a function of composition could aid in accelerating the exploitation of a desired pH value (Section 3.1.2). In the exercises described in the next two sections, students become familiar with manipulation of LEGOLAS robotics for experimentation, quantify acidity measurement uncertainty, extract a meaningful physical constant of the system (dissociation constant) by fitting a Henderson-Hasselbalch equation to the data with non-linear least squares regression, and finally execute a closed-loop Bayesian Optimization based AE by developing a scheduled acquisition function and running the live experiment themselves. Each exercise is intended to build upon skills developed in previous exercises, and to leverage their knowledge of concepts learned in the Machine Learning for Materials Science course. The implementation of these exercises - as well as the team-based nature of the project - are fundamental to the transferability aspect of LEGOLAS outlined in Section 2.1. This is further discussed in Section 3.2.3. Worksheets and Python templates for all exercises may be found at the links below.

Detailed Links: LEGOLAS System Exercises, LegolasOutline.ipynb

3.1.1 Introductory Exercises

Prior to completing any of the exercises, the students were encouraged to master the system calibration tasks and peruse LegolasDemo.ipynb to familiarize themselves with how the robot could be operated. This was a critical step in addressing skills (1), (2), and (3) of transferability: working with their hands on hardware, interfacing with robotics/motors, and demonstrating attention to detail when calibrating the machine.
During the implementation of LEGOLAS as the final project for the machine learning class, it was noticed that groups that tried to forge ahead too quickly to complete exercises without devoting time to these critical steps had much trouble later when trying to run closed-loop AE runs.

Exercise 1: This task asked the students to synthesize one 2 mL sample (50% acid, 50% base), and then measure the pH 10 separate times, cleaning the probe with DI water between each iteration. From these measurements, they were asked to calculate the mean and variance of the pH and present this in the notebook. Completion of this assignment ensured that the group was capable of running the constituent acquisition & deposition functions (by creating the sample), had properly calibrated the liquid-volume/gear ratio (by creating exactly a 2 mL sample), had properly calibrated the pH sensor (by verifying $\mathrm{pH} = 4.75$), and were giving enough time for the pH measurement to stabilize (~20 sec, for a $\sigma^2 < 0.1$). The peculiarities of the measurement and synthesis system presented here are helpful in addressing aspects (1), (2), and (3) of transferability. It was also important for the students to get some idea of the magnitude of the noise factor in their measurements as they continued to approach the other exercises.

Exercise 2: In this task, the students were given the functional form of the Henderson-Hasselbalch equation, as well as descriptions of the variables therein. They were asked to synthesize a grid of compositions (%acid = [10, 20, … 80], %base = 100 - %acid), taking pH measurements between each synthesis step (cleaning the probe with DI water again as well). Students were then asked to use a nonlinear least squares protocol to fit Equation 3.2 to the collected data for the purpose of extracting a pK value (ideally, pK = 4.75):

$$\mathrm{pH}(x, \mathrm{p}K) = \mathrm{p}K + \log_{10}(x), \qquad x = \frac{[\text{base}]}{[\text{acid}]} = \frac{\%\text{base}}{\%\text{acid}} \qquad \text{(Eq. 3.2)}$$

While this exercise again required utilizing skills (1), (2), and (3) of transferability, it was specifically suited towards building upon skills (4) and (5): utilizing knowledge of ML techniques, and programming for data manipulation. Non-linear least squares regression is a critical tool for any scientist dealing with data processing, and - in this example - was shown to produce the “discovery” of a physical constant inherent to the acetic acid buffer system. From the programming perspective, students had to arrange incoming data into an array of values, define a function for optimization, and research the specific nonlinear least squares regression documentation to understand the parameters and returns. It was observed that a small group of students did not understand the reasoning behind alternating between synthesis and measurement (to prepare for an active learning style study) when they could just as easily have prepared all the samples prior to measurement. However, after a discussion, they began to understand the differences between the format of a grid-like exhaustive study as shown here, and the proposed format of a closed-loop AE study. This was a subtle point of misunderstanding that apparently went undetected during the purely theoretical data-science active learning sessions of the earlier weeks (i.e., without the hands-on application).

Exercise 3: In this task, students were introduced to the use of a Gaussian Process and the importance of its kernel and model hyperparameters. There was no experimental use of LEGOLAS within this exercise, as it used the data collected from the grid-like study of Exercise 2. Students were asked to use a Radial Basis Function (RBF) kernel, and output the Gaussian Process hyperparameters (kernel length scale, kernel variance, noise variance) after several iterations of a hyperparameter optimizer.
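The nonlinear least-squares fit of Equation 3.2 in Exercise 2 can be sketched with SciPy's `curve_fit`. This is a minimal illustration rather than the course template: the pH readings are simulated from the Henderson-Hasselbalch form with added noise (an assumption standing in for robot measurements), and the function and variable names are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit


def henderson_hasselbalch(ratio, pk):
    """Equation 3.2: pH as a function of x = [base]/[acid] and the pK parameter."""
    return pk + np.log10(ratio)


# Composition grid from Exercise 2: %acid = 10, 20, ..., 80.
pct_acid = np.arange(10, 90, 10, dtype=float)
pct_base = 100.0 - pct_acid
ratio = pct_base / pct_acid

# Simulated readings (assumed noise level of 0.05 pH units), not real data.
rng = np.random.default_rng(0)
ph_measured = henderson_hasselbalch(ratio, 4.75) + rng.normal(0.0, 0.05, ratio.size)

# Fit the single free parameter pK, starting from a rough initial guess.
popt, pcov = curve_fit(henderson_hasselbalch, ratio, ph_measured, p0=[5.0])
pk_fit = popt[0]
pk_std = float(np.sqrt(pcov[0, 0]))
print(f"fitted pK = {pk_fit:.2f} +/- {pk_std:.2f}")
```

With real measurements, the high-acid compositions would be expected to pull the fitted pK away from the ideal value, which is the deviation discussed for Figure 3.2.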
Next, they were asked to repeat this process, but now using only 6 data points (%acid = [10, 20, … 60]). As mentioned in Section 3.1, the assumptions of the Henderson-Hasselbalch equation are valid only within a certain composition range, which motivates this restriction of the relevant data points (Figure 3.3). Students were observed to ask questions about this limitation, leading to further discussions that helped solidify understanding of the flexibility of a Gaussian Process over a rigid parameterized functional form as seen in Exercise 2. This discussion briefly addressed aspect (9) of transferability: the discernment to understand pros and cons of AE.

Figure 3.3: Gaussian Process (RBF Kernel) demonstrated for 5 samples within a composition range of %acid = [5, … 95]. The 95% Confidence Interval of the GP is represented in light blue, with the mean as a dark blue line. It can be seen that the data points (x’s) deviate from the expected Henderson-Hasselbalch equation (dotted line) at high acid compositions (red region).

In the second portion of Exercise 3, students went further into aspect (9) of transferability, by purposefully implementing bad model assumptions for the Gaussian Process, and analyzing the effects on model outputs. They were asked to rigidly fix certain hyperparameter values - such as length scale - at values that were either too high or too low, to see that this could respectively “underfit” or “overfit” the data. Next, they were asked to use an inappropriate and less flexible kernel (standard periodic) to see that this imposed an (incorrect) periodic form on the Gaussian Process model. Through this type of study, students began to see that these data processing and active learning tools are not necessarily always applicable off-the-shelf, and could sometimes require human insight or tuning to be of real service.
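The two portions of Exercise 3 can be sketched with scikit-learn (one of the packages named in Section 2.4; GPy would serve equally well). This is a hedged illustration, not the course solution: the data are simulated from the Henderson-Hasselbalch form with assumed noise, and the starting hyperparameter values are arbitrary.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel

# Simulated stand-in for the Exercise 2 grid data (not real measurements).
pct_acid = np.arange(10, 90, 10, dtype=float)
rng = np.random.default_rng(1)
ph = 4.75 + np.log10((100.0 - pct_acid) / pct_acid) + rng.normal(0.0, 0.05, pct_acid.size)
X = pct_acid.reshape(-1, 1)

# Signal variance * RBF + noise term; all three hyperparameters are optimized.
kernel = ConstantKernel(1.0) * RBF(length_scale=20.0) + WhiteKernel(noise_level=0.01)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True,
                              n_restarts_optimizer=5, random_state=0)
gp.fit(X, ph)

# Fitted hyperparameters (variances are in units of the normalized pH).
fitted = gp.kernel_
print("kernel variance:", fitted.k1.k1.constant_value)
print("length scale (% acid):", fitted.k1.k2.length_scale)
print("noise variance:", fitted.k2.noise_level)

# Posterior mean and standard deviation over the composition range.
X_query = np.linspace(5.0, 95.0, 91).reshape(-1, 1)
mean, std = gp.predict(X_query, return_std=True)

# Deliberately poor assumption: a very short length scale, held fixed by
# disabling the optimizer, so the model trusts the data only very locally.
rigid_kernel = ConstantKernel(1.0) * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.01)
rigid_gp = GaussianProcessRegressor(kernel=rigid_kernel, optimizer=None, normalize_y=True)
rigid_gp.fit(X, ph)
_, rigid_std = rigid_gp.predict(X_query, return_std=True)

print("mean predictive std, optimized:", float(std.mean()))
print("mean predictive std, fixed short length scale:", float(rigid_std.mean()))
```

Plotting `mean` with a band of two standard deviations against the idealized curve and the data reproduces the kind of comparison shown in Figure 3.3.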
Throughout this exercise, students were also asked to plot the Gaussian Process (mean, 2σ Confidence Interval), idealized Henderson-Hasselbalch model (for reference), and experimental data (see Figure 3.4 for example graphs). This ensured that they were also working on skill (8) of transferability: presenting data in a digestible & meaningful fashion.

3.1.2 Autonomous Closed Loop Exercise

The Autonomous Closed Loop exercise was the pinnacle achievement for the students and was intended to demand the most time & thought on their part. It was designed to combine all 9 critical aspects of transferability, with emphasis on (6), (7), and (9): the insight to encapsulate research “objectives” into active learning schema, the foresight and knowledge to simulate live-runs, and the discernment to understand AE pros & cons. Students who did not have a good grasp of the theoretical fundamentals of active learning (from the prior classwork) struggled to understand the significance and development of acquisition functions. Many students also had a hard time differentiating between what they knew about the composition-property relationships (because they were given the Henderson-Hasselbalch equation) and what the robot knew (because in this toy problem we were assuming it began uninformed) [66]. This was surprising, as these same students had succeeded in completing simulated active learning assignments earlier in the course. However, it highlighted the most beneficial aspect of LEGOLAS: AE as a field cannot be taught reliably and effectively without a real-world tangible system for experimental application. The majority of the students who had struggled to understand the benefits of active learning and AE were then capable - after successfully completing Exercise 4 - of clearly elucidating their thought processes and acquisition function implementation while presenting their final project results.
Exercise 4: In this task, the students were able to use the Gaussian Process understanding and code they developed in Exercise 3 within a closed-loop AE process framework to autonomously determine the composition (mixture ratio) at which the pH = 4.75. As stated earlier, this led to some confusion among students, as they already knew what composition this occurred at (50% acid, 50% base), and they had already collected enough data in Exercises 1-3 to reliably extract this. It was also confusing to students that this was indeed an exploitation based CO, because the property value they were targeting was neither a maximum nor minimum acidity metric. The students were told to discuss the merits of different acquisition function schemas, and encouraged to combine exploitation and exploration COs in some manner. This was an iterative process in which students would come to the TAs for guidance. Collectively, the class would first talk through what it meant to explore (looking towards the unknown), what it meant to exploit (where do we think our target is, considering what we know?), and what would be an appropriate order in which to execute those initiatives. The drawbacks of exploiting from the start (i.e., “shooting in the dark”) and purely exploring (effectively a grid search) were discussed. A question that often came up was why we were doing this in the first place (the goal is to reduce the number of experiments). Of course, a toy problem like the Henderson-Hasselbalch equation seemed to not need this reduction in experiments, but the discussion of the merit of AE in this case was a valuable expression of aspect (9) of transferability: discerning between AE pros and cons. Once the theoretical and philosophical discussions of the CO order were understood, the students again came for guidance with respect to the mathematical encapsulation of these human ideas into quantitative acquisition functions.
This discussion, along with the prior philosophical one, embodied aspect (6) of transferability most directly: the insight to encapsulate research “objectives” into active learning schema. Some groups decided on a hardcoded cutoff between a purely exploration based CO (Equation 3.3) and a purely exploitation based CO (Equation 3.4). This could, for example, consist of doing 4 experiments exploring, then spending the remainder of the experiments exploiting. Other groups recognized that although this approach could work, it fundamentally depended on our prior knowledge of the simplicity of the system for picking this cutoff point. Those groups, instead, utilized their knowledge of other techniques - namely, a modified upper confidence bound (UCB) - to have an adaptable and continuous transition between these COs (Equation 3.5) [90]. Certain aspects of UCB tuning led to further discussions, broadening the students' understanding of the potential data research opportunities within the acquisition function based side of AE.

$$x^{*} = \arg\max_{x}\left[\sigma^{2}(x)\right] \qquad \text{(Eq. 3.3)}$$

$$x^{*} = \arg\max_{x}\left[-\lvert\mu(x) - 4.75\rvert\right] \qquad \text{(Eq. 3.4)}$$

$$x^{*} = \arg\max_{x}\left[\left(-\lvert\mu(x) - 4.75\rvert\right) + \beta\,\sigma(x)\right] \qquad \text{(Eq. 3.5) [90]}$$

Here $\mu(x)$ and $\sigma(x)$ are the Gaussian Process posterior mean and standard deviation at composition $x$, and $\beta$ denotes the weighting applied to the uncertainty term, as in [90].

In the actual implementation of Exercise 4, the students also began to understand that they needed an exit criterion at which the AE loop would terminate. Upon further discussions, it became clear that this could introduce even more complexity to their AE loop. If, by chance, they serendipitously selected a composition of 50% acid, 50% base (with a pH sufficiently close to 4.75) during their exploration initiative portion, would that constitute success? Should they verify their result? Was it important to continue searching the composition space for other potential compositions that might produce this specific acidity? No singular answer to these questions was identified as correct, and the discussion itself was the teaching point.
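The acquisition functions and the hardcoded explore-then-exploit schedule discussed above can be sketched as a simulated closed loop. This is an illustration under stated assumptions, not any group's submitted solution: the robot is replaced by a Henderson-Hasselbalch function with white noise, the constant `weight` in `blended` is a placeholder rather than the weighting of [90], and the seed compositions, kernel bounds, tolerance, and budget are arbitrary choices.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel

TARGET_PH = 4.75


def explore(mu, sigma):
    """Exploration (Eq. 3.3 form): prefer the most uncertain composition."""
    return sigma ** 2


def exploit(mu, sigma):
    """Exploitation (Eq. 3.4 form): prefer predicted pH closest to the target."""
    return -np.abs(mu - TARGET_PH)


def blended(mu, sigma, weight=1.0):
    """UCB-style blend (Eq. 3.5 form); the constant weight is an assumption."""
    return -np.abs(mu - TARGET_PH) + weight * sigma


def simulated_ph(pct_acid, rng, noise_sd=0.05):
    """Stand-in for the robot: Henderson-Hasselbalch plus white noise."""
    return 4.75 + np.log10((100.0 - pct_acid) / pct_acid) + rng.normal(0.0, noise_sd)


def run_loop(n_explore=4, budget=12, tol=0.05, seed=0):
    rng = np.random.default_rng(seed)
    candidates = np.arange(5.0, 96.0)            # % acid grid, 5 to 95
    x = [10.0, 30.0]                             # two arbitrary seed samples
    y = [simulated_ph(p, rng) for p in x]
    kernel = (ConstantKernel(1.0) * RBF(20.0, length_scale_bounds=(10.0, 100.0))
              + WhiteKernel(0.01))
    for step in range(budget):
        gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
        gp.fit(np.array(x).reshape(-1, 1), np.array(y))
        mu, sigma = gp.predict(candidates.reshape(-1, 1), return_std=True)
        # Hardcoded cutoff: explore for the first n_explore steps, then exploit.
        acq = explore(mu, sigma) if step < n_explore else exploit(mu, sigma)
        acq[np.isin(candidates, x)] = -np.inf    # do not repeat a composition
        nxt = float(candidates[np.argmax(acq)])
        x.append(nxt)
        y.append(simulated_ph(nxt, rng))
        # One possible exit criterion; it also accepts a lucky hit while exploring.
        if abs(y[-1] - TARGET_PH) < tol:
            break
    best = int(np.argmin(np.abs(np.array(y) - TARGET_PH)))
    return x[best], y[best], x


best_pct, best_ph, history = run_loop()
print(f"best composition: {best_pct:.0f}% acid, measured pH {best_ph:.2f}")
print("compositions tried:", history)
```

Replacing the scheduled `explore`/`exploit` choice with `blended(mu, sigma)` gives the continuous transition described for Equation 3.5, and the exit test is the place where the open questions about verification and continued search would be encoded.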
Without a tangible experimental system such as LEGOLAS, pertinent questions like this are easily overlooked.

As mentioned before, some groups who did not give much attention to calibration (aspect (3)) either had hardware malfunctions with the robot (students were not supposed to touch or assist the robot during the final closed-loop study), or erroneous pH metrics that led to incorrect answers. Another critical aspect of designing their final closed-loop AE was to leverage the utility of simulated runs. Because the synthesis/measurement/analysis loop could take an average of 90 seconds, groups that attempted to troubleshoot their active learning code during live runs could become extremely frustrated and fall far behind others due to the time they were spending resetting the experiment. Those that followed the advice of using simulated experimental data (i.e., a Henderson-Hasselbalch function with some random white noise applied) refined their active learning code, and once they knew it worked, were able to confidently proceed with the experiment (aspect (7) of transferability). An example of a live-run evolution of the Gaussian Process for a purely exploration based CO (Equation 3.3) is shown in Figure 3.4, for samples 1 through 5. The Gaussian Process can be seen to approach the logarithmic behavior of the Henderson-Hasselbalch equation, as the acquisition function prioritizes compositions with high degrees of model uncertainty.

Figure 3.4: Gaussian Process evolution (RBF Kernel) for first 5 experimental samples with a purely exploration based acquisition function.

3.2 Class Implementations

This section provides a summary of the implementations of LEGOLAS within the Machine Learning for Materials Science course in the Fall '21 and Fall '22 semesters. This class (3 credits) is listed on the University of Maryland Schedule of Classes as ENMA437 (previously ENMA489L) for undergraduates, and ENMA637 for graduate students.
It has been offered once a year in the Fall semester since 2020. Enrollment has averaged around 12-16 students, typically half graduate and half undergraduate students. I have served as the main teaching fellow for the class in 2021 and 2022, when LEGOLAS was fully implemented as the final project tool. Prior to the Fall 2021 offering of the class, I had spent the summer developing LEGOLAS.

Course Description: Familiarizes students with basic as well as state of the art knowledge of machine learning and its applications to materials science and engineering. Covers the range of machine learning topics with applications including feature identification and extraction, determining predictive descriptors, uncertainty analysis, and identifying the most informative experiment to perform next. One focus of the class is to build the skills necessary for developing an autonomous materials research system, where machine learning controls experiment design, execution, and analysis in a closed-loop.

The LEGOLAS Exercises were designated as the final project for the class, and students were split up into teams of 4-6 per robot. They were given the last 3-4 weeks of in-class time (3 hrs/week) to work on the project, and allowed extra lab time if they felt they needed it. Credit was given for each exercise either through TA observations of completion, or through a video taken by group members (see LEGOLAS Rubric in Detailed Links). Additionally, students prepared a final presentation (20 minutes) in which they overviewed their completion of the Exercises, elucidated their decision making processes, highlighted the difficulties they had, and offered suggestions for course improvements.

Detailed Links: LEGOLAS Rubric

3.2.1 Fall 2021

In the Fall of 2021, the class utilized Generation I & II LEGOLAS systems (Figures 2.12 - 2.14). There were 3 teams of 6 students, w