Wednesday, November 26, 2008

Open Notebook Science in 15 minutes

Jean-Claude Bradley: All right. So, I will try to explain to you these two concepts of the synthesis of anti-malarial compounds and Open Notebook Science in the next 15 minutes. Well, this is actually a pretty good time to give this talk. This week we actually got our Wikipedia entry for Open Notebook Science. And it turns out that it required a lot of people's coordinated efforts. And it required a body of work before we were able to do this. But, if you go on Wikipedia, you can learn a lot more that I don't have time to explain today.

The idea of Open Notebook Science is basically to report the work that you do in the laboratory in real time or as close as you can to real time, so that the entire world knows as much as you do about your research. Like I said, there are a number of references here where you can take a look at the background of this. But, the motivation is that - well, it should be self-evident that it's a way to do faster science compared to either not disclosing some things or significantly delaying them.

And I think, it's also a way of doing better science, which is not immediately obvious, but hopefully I will show you some examples of how that can be.

OK, now to the synthesis part. So, we are a synthetic organic chemistry group and our target is malaria, specifically Falcipain-2. Malaria, as you should know, is a disease that is spread by mosquitoes. And here's actually the malarial parasite inside of the red blood cell. And it uses the enzyme Falcipain-2 to metabolize hemoglobin. So, if we can inhibit that enzyme, it could be a way to basically stop the process of it replicating.

And so, what we have done is we have collaborated with Rajarshi Guha's group at Indiana University. And he does the docking, which basically means that he takes the Falcipain-2 in the computer and tries to dock molecules and see if they fit or not. If they fit, there's a chance that they might inhibit it. And so, he tells us which compounds to make. And then we make them and we ship them off to UCSF where Phil Rosenthal does the testing.

So, this is a collaboration done completely openly, and people can join in or they can follow what we are doing well before the publications come out. So, I am not going to talk too much about the nuts and bolts of it, but, suffice it to say that we use blogs, wikis and all these different social networking sites to try to make this system fully hosted and fully replicatable by anyone else in the world who might want to do a similar thing. And that's happened. And I can surely talk to you if you are interested in seeing the different groups that have done that.

I was telling you that it's a way of doing better science, and it really comes down to "where's the beef?" when you talk about your experiments. So, this is a blog post here, where we are talking about doing different things. And it says "See experiment 150 for more information." So, this is the Ugi reaction that I will be mentioning over and over in this talk.

And if you click on that link, it takes you to the lab notebook page for experiment 150. And this actually looks very similar to what it would look like in a paper notebook. And that's on purpose. We wanted to make things as easy as possible for people to get involved with Open Notebook Science.

So, you have an objective, and you have all these different hyperlinks. So, one of the things that you can link to - and then this is a pretty long page, I am just going to skip through it giving you examples. You can hit that Ugi edit link and it takes you to an entry in ChemSpider. ChemSpider is a free database. It has over 21 million compounds. You can do such sorts of searching. You can do all kinds of things for free. And I don't have to worry about that on my server.

So, that's what we are trying to achieve here. We are trying to get high quality information processing without having to become computer scientists to do it. And it's becoming really possible to do.

We also link to the docking procedure that our collaborator Rajarshi uses. Again, here the idea is that this is replicatable. Someone who has done docking before should be able to get enough information from this page to generate the same compounds in the same order; all right? So, these are called SMILES codes and they are convenient ways of representing molecular structures, and you can just dump them in spreadsheets. So, it's a pretty convenient way.

Again, this is all made explicit, so you don't have to ask the researcher for permission. You can just go and look at the results.

Another very helpful thing is our spectra. If you know anything about organic chemistry you know that the basis of it is spectra, especially NMR spectra. And there's actually a very neat way - if you have your NMR spectrum in JCAMP format, you can run JSpecView so that someone who does not know anything about Java or anything just hits this link and this spectrum pops up, and it's actually interactive.

So, you can use your mouse and drag across any peak and it will expand. Again, here - this is what I am talking about with doing better science, you know. Maybe you didn't expand that peak in your paper. Maybe you didn't talk about it. But, if I am trying to replicate this or I am trying to extend your research, maybe I am interested in that peak. Maybe I want to measure it. And so there are just more details.

So, by the time we end up with the final conclusions and it says "This Ugi product was obtained in 59% yield," you don't have to take our word for it. It's all backed up - either well or poorly - but it's all backed up, exactly what's supporting our statement.

If you are not familiar with the wiki, the reason that we use it for a live notebook is that every time there's a change made, it tells you who made the change and exactly when. And we have a third party time stamp for it. So, we can claim that we knew what we knew exactly when. And we are not running the time stamp. It's run by a third party that's well respected. So, that could be interesting down the road to settle claims.

We can compare any two versions, and Wikispaces basically shows you the stuff that was added in green and the stuff that was deleted in red. So, it's a really nice way to understand what people are good at, right? Because this is a collaboration, many people in the lab working together, certain people are good at some things and other people are good at other things. And this is a really good way to keep track of all that.

Now, to find information, that's actually a big issue. Obviously, if we just left it in the wiki like that - I mean we have tags. We have ways of searching for the information. But, you don't want to have to do that if you are interested in seeing the collection of experiments that we have run.

So, we've run this Ugi reaction several times and we have modified the conditions. So, we have used different starting materials, different solvent amounts, and different concentrations. And we have sometimes gotten a nice precipitate that was pure product and sometimes we haven't. So, we are trying to understand that. And we are using these Google Docs as a way of sharing that information in a very convenient way.

So, this is a spreadsheet. It works very similarly to Excel but it's free and it's hosted. So, I will show you an example of an opportunity we had recently to use a robot from Mettler-Toledo. And we were able to actually automate the optimization of this reaction. This was done in collaboration with Dr. Owens. He did some statistical analysis, which I won't have time to get into. But, the idea here is that we wanted to find the highest yields - the conditions for the highest yields.

So, we modify concentration, we modify the solvent, and we modify the excess of some of the reagents. So, we actually did these reactions in little tubes that had a filter at the tip. So, the robot added the four different components. And then it precipitated or it didn't. And if it did, we just washed it and then weighed the results. And of course took an NMR to make sure that we actually got the compound.
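As a rough illustration of what varying three factors at once looks like, here is a minimal sketch that enumerates every combination of concentration, solvent and reagent excess. The levels below are hypothetical placeholders (the talk does not list the values that were screened), and a full grid is only one possible design; the statistical design actually used is not described here.

```python
from itertools import product

# Hypothetical factor levels, chosen only for illustration; they are not
# the values screened in the robot run described in the talk.
concentrations_molar = [0.2, 0.4, 2.0]
solvents = ["methanol", "ethanol"]
reagent_excess_equiv = [1.0, 1.2]


def build_conditions(concentrations, solvent_names, excesses):
    """Return one dict per reaction tube, covering every combination."""
    return [
        {"concentration_M": c, "solvent": s, "excess_equiv": e}
        for c, s, e in product(concentrations, solvent_names, excesses)
    ]


conditions = build_conditions(
    concentrations_molar, solvents, reagent_excess_equiv
)
print(len(conditions), "tubes to run")
```

Each dict could then become one row of a shared spreadsheet, with the isolated mass and yield filled in after the run.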

So, this is a picture of the robot. And it's basically just a syringe that goes and takes the liquids up and dispenses them. An interesting thing about using a robot is that you automatically get the log of what the robot did. And it pays attention all the time. So, it will record what it thinks it did. That's a double-edged sword. It gives you a lot of information. If you want to debug things, yeah, you absolutely have some good data to look at.

But, it also means because you are able to do so many more experiments, you have to be even more vigilant about systematic errors. And we've had that problem. And so, you end up doing a thousand experiments before you find the problem, all right.

But, once you get it working, actually, this can be extremely useful. So, just to go to the final results here. So, we did these experiments and we had enough material to publish a paper. So, here's another use of the wiki where we actually wrote the paper in the wiki. So, every single draft was saved. And we can go back and see exactly how the paper was written.

And the really nice thing about having a notebook to point to is... See, I can have references 9 to 11 be the melting point of the compound, and I can specify the batch that it was taken from - from experiment 99, whereas the proton NMR was taken from experiment 203, sample A 11. So, that information is typically not part of a typical publication. You assume that the guy knows what he is doing and that he actually characterizes his compounds properly. Well, that is not always the case, as we find out painfully. So here we can actually go and see if there is a problem with the specific batch if we are not getting the same information.

Now, where we actually submitted this paper is kind of interesting. It's called the Journal of Visualized Experiments, JoVE. So, there is a written part to this that I just showed you; that is what we wrote on the wiki. And they actually sent some camera people to record our experiments. And so, this is now under peer review. And we should hear back shortly about this. And I don't see any problems and I don't expect any problems.

So, this will be a nice way to communicate with video as well. So, there are so many tools now that make communicating your science faster without losing anything. Another thing that the physicists have been using for a while is pre-print servers. So, chemistry really didn't have a good pre-print server - well, they did, but that's a whole other story; it's no longer working. So, Nature actually recently came up with this Nature Precedings, which is a pre-print server and it's backed with the editorial filter of the Nature Publishing Group. If you are not familiar with Nature, it's one of the most well-respected publishers out there.

So, if they basically say that this has good scientific quality, it's probably true. And so, we can before publication in JoVE or any peer review journal that we choose to publish in, we can actually link to this document. People can comment on this document. They can vote on it. They can give us feedback. You can have versioning on here. All kinds of things you can do.

Normally, when you have a paper coming out, you just tell people "Well, it's going to come out next week," and when it does, "Here's the link." Well here, now, you can actually give the link and you can have [inaudible 10:39].

So, the bottom line here is we did find a maximum yield - 66%. We went in with a yield of about 49 to 50% - we got some increase. But, the major result of this was really to prove that we could optimize the reactions in robotics.

Now, as far as the malaria project goes, that's actually important because that's how we make our compounds, with the Ugi reaction. Recently, we've actually gotten some results about this. We have four compounds that actually are active in inhibiting the enzyme, and they are also effective in inhibiting the infection by Plasmodium falciparum. And these are in the micromolar range. So, it's not bad. I mean, it's definitely publishable stuff.

And there are different stories here. We used one receptor area on the enzyme here. We used another receptor here. I don't have time to get into it, but it's kind of interesting the results that are coming out of this. And again, this is out into the open. And we never know who is going to stop by and collaborate.

A last little story. I recently did a little trip in the UK. And my friend here Cameron Neylon who also does open notebook science - although, he uses a different system than I do, we had the chance to spend a day in the lab to do experiments. And one of the things that evolved from my trip is a very simple project using open notebooks. And we spent the day measuring solubilities. So, we took a bunch of compounds and we took a bunch of organic solvents, and measured the solubilities. And then, we reported these solubilities in a Google doc.

Now, this is actually very interesting. So, for Boc-glycine and methanol, we are measuring 4.4 molar. And you notice that that's in green. And down here, for D-glucose and methanol, we do get a number, right, and it's 0.05. But, I put it in red and I don't actually include that number in my final results, because I am not satisfied that I am going to stand behind these. I don't think that 1.8 milligrams in the way that we were measuring it is good enough to report this.

But, what if you want a ballpark estimate? You can still access my number and you have all the details of the context in which it was taken. So, again, that's better science, I think. And what we are trying to do with this project - it's actually related to the malarial project in the sense that we can measure solubilities, report them publicly, and then build models; and Rajarshi Guha is going to help us build models of solubility - we should be able to predict the yields of these Ugi reactions in different solvents.

So, the idea is, for this Ugi product, you should do it in 51% methanol, 4% ethanol and the rest is acetonitrile. So, that will be a very powerful thing that can be used not just for our project, but really anyone could. And sort of to get this ball rolling, I set up this Open Notebook Science Challenge. And what it is, it is essentially we are asking people from around the world to contribute their measurements so long as they link them to a well maintained notebook. And if they do that then we can use these results, and we can publish with them, and we can do everything that we do as scientists.

And we have a sponsor. Aldrich has actually volunteered to ship compounds anywhere in the world to encourage people to do this. So, I am very excited about this. It's a new initiative. And I think it has a good chance of working.

And there are so many people to thank here. Khalid is my grad student. Kevin Owens you just heard from. Tim Bohinski is an undergrad who just started working in my lab this term, measuring solubilities. James is also an undergrad. Tom Osborne is the Mettler-Toledo rep who was very patient and took a lot of time to bring us the robot for us to get these results.

Antony Williams is the guy who runs ChemSpider, the database that I showed you for molecules. Andrew Lang actually put our results into Second Life. Because of the briefness of this talk, I wasn't able to get into that. But, you can visualize the optimization of the reaction using 3D plots. You can rotate them in Second Life. So, Andy did that. And of course, Cameron from Southampton.

So, that's it. Any questions?

Monday, November 24, 2008

iSchool Open Notebook Science talk

Jean-Claude Bradley: I would like to tell you today about Open Notebook Science. My talk is based on the work we did in chemistry in terms of making anti-malarial compounds and measuring solubilities. But, I have actually put this talk together in a way to sort of minimize the chemistry and focus more on the IT aspects of it. Hopefully, by the end of it, you will understand pretty well what it is we are doing.

This really comes as, there are several themes that have been emerging the past few years in science and in teaching and one of them is actually openness. There are a few people here who are doing more open teaching in terms of recording their lectures and making them open on iTunes. All of these things are progressing.

The same thing is happening in research. We are going from a world where we have a traditional lab notebook, which is unpublished and will never see the light of day unless somebody writes a paper about it, to more and more open forums. Traditional journal articles are more open, but people have to pay for them, and so access is limited to only a certain subset of people.

Recently, people are talking about open access journal articles. Again, that's more open, because the articles are free to access; however, generally the authors have to pay for the cost, so it's not generally a totally free deal.

At the end of the spectrum, we have Open Notebook Science. The idea of that is total transparency in the research process. We want to make available the actual lab notebooks of the students in real time or as close to real time as possible for the world to see so that's what I'll be talking to you about.

Now, my job is a little bit easier as of the past couple of weeks because we now have our Open Notebook Science entry in Wikipedia. If you're more interested in looking at some references, we've got some things coming out in Cell, C&E News and Nature. So, with this kind of accumulating scholarship, it actually is starting to really take shape. I'll tell you about some of the people, besides myself, who are involved in this.

Again, I am going to start from the IT perspective instead of the chemistry. The way that I always like to talk about this, which happens to be usually my last slide for the chemists, is that we're moving from a world where we have human-to-human communication. That's what science has really been from the very beginning, one scientist telling another scientist what they did or trying to avoid what the other scientist did.

Right now, I think we are in a very interesting phase where we are starting to have humans communicating with machines and back and forth in terms of scientific information. Eventually, I think that the whole scientific process will get done by machines talking to machines, but I think in order to get to that point we have to go through this period where we have to somehow find a compromise so that both humans and machines can access the same information and talk to each other.

That comes down to what the information is that chemists actually manipulate. There's this concept of what's a fact. What's true? What's false? If we find a number, how sure can we be that it is close to the true value? We have to remember, and students often forget this, that there really are no facts; there are just measurements embedded within assumptions. The problem with the way that chemistry, especially, is being currently communicated even in traditional journals is that those assumptions are not made explicit. It is very difficult to tell exactly what was done or what the author actually did or didn't do.

Open Notebook Science maintains the integrity of the data provenance from the lab notebook all the way to whatever documents happen to come out of that. So, if somebody wants to question a number, they can just click through and have access to the original lab notebook page.

Here, we are moving from a concept of trust and there is actually a lot of trust today in science. If someone that you know writes a paper, you are more likely to think that it's correct or more likely to be more valued than if someone else writes it, or if an article comes out in a certain journal you may give it more credibility than in another.

We have that in all the three fields. Chemistry has its own journals where if there's a yield that shows up, you know that you can't trust it because it's a certain journal, which I won't say which it is, but as chemists, we all know those. That's really based on trust. I think that if we start to move away from trust and move to just simply providing proof, if you provide sufficient proof you don't need trust anymore because the machine or the person has to back up everything that they're claiming.

Let me give you a very specific example. Here, we are looking at the solubility of 4-chlorobenzaldehyde. Solubility is just how much of a compound goes into solution, a very, very simple chemical concept. It should be something fairly easy to answer. You should just basically look it up in the literature. It turns out that, surprisingly, a lot of these measurements have not been done. Very, very simple things, but you can't access them.

It should be a very simple thing. Give a couple of undergrads some compounds and a scale and just let them go. Well, it's not that simple. If you look at the values that our students have been obtaining, you'll see here a high number five to one to 3, and then there's a number here 0.07. That number was collected along with all the other numbers since we're just reporting results.

As a chemist I looked at that and it didn't make any sense at all because these compounds are actually pretty similar chemically. It would be extremely surprising for them to have very different properties. Either, we discovered an extremely interesting phenomenon or, as is more likely the case, there is a problem with the way the measurement was done.
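The kind of sanity check described here, noticing one value far out of line with chemically similar compounds, can be roughed out in code. The sketch below is only an assumption about how one might automate a first pass: the fold-difference rule, the `max_fold` threshold and the compound labels are hypothetical, and only the 0.07 figure comes from the talk. The judgment in the talk was made by a chemist, not by a rule like this.

```python
from statistics import median


def flag_suspect_values(solubilities, max_fold=10.0):
    """Return names whose value is more than max_fold away from the median.

    solubilities maps a compound label to a measured value (same units).
    Non-positive values are always flagged, since a fold difference
    cannot be computed for them.
    """
    mid = median(solubilities.values())
    suspects = []
    for name, value in solubilities.items():
        if value <= 0 or mid <= 0:
            suspects.append(name)
            continue
        fold = max(value / mid, mid / value)
        if fold > max_fold:
            suspects.append(name)
    return suspects


# Placeholder values for a series of similar compounds; only 0.07 is
# taken from the talk, the rest are made up for illustration.
measured_molar = {
    "compound_a": 5.0,
    "compound_b": 1.0,
    "compound_c": 3.0,
    "compound_d": 0.07,
}
print(flag_suspect_values(measured_molar))
```

A flagged value is a prompt to open the notebook page, not a verdict; the number may still turn out to be real.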

For that particular measurement, by looking at the lab notebook (I'm going to actually look at the specific experiment for a bunch of slides), we were able to uncover the fact that this compound was evaporating in the SpeedVac, the machine that basically lets us find out how much is dissolved.

We redid the experiment and now we get a value of 3.6 molar. That number makes more sense, so that number is validated according to what I think based on what I saw of the experiment, whereas the first number is rejected. In the literature, you will not get to dig down into the original proof for this.
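One way to picture "validated" versus "rejected" numbers that both stay available is a small record type that keeps each value together with a pointer to its notebook page. This is a minimal sketch under stated assumptions: the `Measurement` class, its field names and the page identifiers are hypothetical and are not the structure of the actual notebook or spreadsheets; only the two values, 0.07 and 3.6, come from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    compound: str
    value_molar: float
    notebook_page: str  # hypothetical page identifier, not a real link
    accepted: bool
    note: str = ""


def reportable(measurements):
    """Keep only the values the researcher is willing to stand behind."""
    return [m for m in measurements if m.accepted]


records = [
    Measurement("example_compound", 0.07, "page-first-run", False,
                "rejected after reading the notebook log"),
    Measurement("example_compound", 3.6, "page-repeat-run", True),
]

for m in reportable(records):
    print(m.compound, m.value_molar, "see", m.notebook_page)
```

The rejected record is not deleted: anyone who wants a ballpark figure, or wants to check the reasoning, can still read it along with its note and page reference.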

Let's take a look at what is actually in the lab notebook page. We're missing a little bit of the screen here. Up here, it says log, so this is the log section of the lab notebook. It basically has just a sequence of what the students did and what they observed. That's how you are supposed to keep a lab notebook in organic chemistry.

I have highlighted a couple of things here. Unfortunately, there's a section missing, and the section that is missing actually said "did not measure the amount of time vortexed." The lab notebook doesn't just tell me the stuff that the students did. It tells me what they didn't do, whereas if I were using a trust-based system I would assume: oh, this person probably measured the time they did this. They probably measured the pressure. But unless you actually check, you can't know that.

Down here is the same thing. There are actually times here that are missing from the screen. I can actually see how long they put it in and what they did or didn't measure.

The other thing that we can actually drill down to is the rationale of the findings. We can make those explicit. This is actually a discussion and conclusion section off that same page, and it basically talks about the data. It talks about the raw data. Some of this, I wrote. My student wrote another part. Someone else might have come in and actually added to it. It explains the rationale of why we think that 0.07 number is incorrect.

You can look at the raw data, but you can also look at the rationale of the scientist. Again, this is not typically provided in a paper because this is actually just developing. You can also look at the actual documents that are offered up as proof.

Here, we actually have pictures. You can see here that there are some interesting things. These are after evaporation so we would expect to have all dry solids here, but you can see that the one on the left actually still has a bit of liquid left in it. Is that a problem? In this particular case, it turns out it's not a huge problem, but you're made richer by knowing that there was a potential issue here.

You'll notice also that number 46 is all covered in white, and it looks different from the other ones. I would look at that number a little bit more cautiously because it looks like it actually burst and started to bubble over while it was evaporating. Those are the things, again, that are not provided in a typical research article in chemistry.

Down here this is actually on Flickr. Everything that we're using is as open as possible. This is the nice thing, that you can get random people coming in and making their own contribution or using the data in a way that you wouldn't expect. This guy thought that it was a good looking picture. It had nothing to do with chemistry, but that doesn't matter. This is the Web 2.0 type of information sharing.

We also make use of YouTube. That actually is a very efficient way to record experiments because, again, instead of asking the student to write every detail of what they did you could just do a quick video, and I can actually ask myself questions: did we hook this up correctly? Where is the thermometer? All kinds of things that are gone if they don't record it.

Here, we are actually using very short clips. We're not talking about recording the entire experiment over hours. We are talking about a 30-second clip. Show me your setup and then they don't have to write it up. This is a way of learning....

Audience Member: It seems like 20th century technology where you have to make analog recordings that are translated into digital as opposed to, maybe, in the future having smarter machines that automatically record what you're doing.
Jean-Claude: Yes.
Audience Member: ...a chance to measure temperature and stuff like that.
Jean-Claude: That's where we're headed. In fact, I'll show you a robot that we've used exactly in that way. The thing is, you know, we don't have robots for everything, and we don't have all the stuff accessible. We have to do what we can with what we have. But I agree, if all this could be automated, it would certainly be a lot easier.

The other things that we show are the calculations. This is a really simple measure. You are basically just seeing how much material dissolves in a certain amount of liquid, and you're just measuring it. It turns out there are a lot of calculations in that. You have to weigh the empty vial. You have to weigh it with the liquid. You have to weigh it after it evaporates.

It's clear that there are a lot of places where you can make mistakes. If you see a number that looks strange, I would come here first to actually take a look at the calculation. Maybe, the student made a mistake. If they did make a calculation mistake, then, at least, I know that and I can deal with it. I can drill down to see exactly where the problem might be.
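To make the arithmetic behind the three weighings concrete, here is a minimal sketch of one way to turn them into a concentration. It is an illustration under stated assumptions, not the group's actual spreadsheet formula: the function name and parameters are hypothetical, and the result is moles of solute per litre of solvent, which only approximates molarity because it ignores the volume change on dissolving.

```python
def solubility_mol_per_litre_solvent(
    mass_empty_vial_g,
    mass_vial_with_solution_g,
    mass_vial_after_evaporation_g,
    solute_molar_mass_g_per_mol,
    solvent_density_g_per_ml,
):
    """Estimate solubility from the three weighings described above."""
    # Solid left behind after the solvent is evaporated off.
    solute_g = mass_vial_after_evaporation_g - mass_empty_vial_g
    # Whatever mass disappeared on evaporation is taken to be solvent.
    solvent_g = mass_vial_with_solution_g - mass_vial_after_evaporation_g
    if solute_g < 0 or solvent_g <= 0:
        raise ValueError("weighings are inconsistent; check the raw data")
    solute_mol = solute_g / solute_molar_mass_g_per_mol
    solvent_litres = solvent_g / solvent_density_g_per_ml / 1000.0
    return solute_mol / solvent_litres


# Made-up numbers, for illustration only.
print(solubility_mol_per_litre_solvent(10.0, 12.0, 10.5, 100.0, 0.75))
```

The consistency check is there because a vial that weighs less after adding solution, or a volatile solute that leaves with the solvent, will otherwise produce a plausible-looking but wrong number.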

Here, we're making pretty extensive use of Google Spreadsheets. It's a really great way to share any kind of calculated information like this, and you can do all kinds of calculations. It's totally free and hosted.

The other nice thing about the Google spreadsheets is it's not that dangerous to make them open for editing because you can see the history. If someone comes in and just completely deletes all of your values, it's a little bit annoying. It hasn't happened yet, but it's not a big deal. You can just basically go back to a previous version, and it will restore it.

This is nice because we can now make these spreadsheets editable to anybody, so someone can come in here and actually mark something up and write a comment. You can color code something. It becomes so much more flexible than having to give people permission to come in and modify a certain spreadsheet.

Now, as you've gathered, we use a wiki for the lab notebook itself. There are a couple of reasons for that. The main one is that the wiki gives you a page history. I can see every single version since the beginning of that experiment. I can see who made the contribution. By comparing two versions, I can see what each person added at each point.

We're using Wikispaces, and Wikispaces shows the stuff that's added in green and the stuff that's deleted in red. Often times, I will write a comment and ask a student a question, and then they will address the question and remove my comment. That's exactly what happened here. The red stuff was my original question.
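The added-versus-deleted comparison between two page versions is an ordinary text diff. As a minimal sketch of the idea, and not of how the wiki itself computes it, Python's standard `difflib` module can separate added lines from removed ones; the two page versions below are invented for illustration.

```python
import difflib


def version_changes(old_lines, new_lines):
    """Return (added, removed) lines between two versions of a page."""
    added, removed = [], []
    for entry in difflib.ndiff(old_lines, new_lines):
        if entry.startswith("+ "):
            added.append(entry[2:])    # would be shown in green
        elif entry.startswith("- "):
            removed.append(entry[2:])  # would be shown in red
    return added, removed


# Invented page contents: a supervisor's question is replaced by an answer.
old_version = [
    "Objective: measure solubility",
    "Question: did you record the vortex time?",
]
new_version = [
    "Objective: measure solubility",
    "Vortexed for 2 minutes",
]

added, removed = version_changes(old_version, new_version)
print("added:", added)
print("removed:", removed)
```

Stored alongside the author and time stamp of each version, the same comparison shows who contributed what and when.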

This is actually also a great tool for interacting with students, especially graduate students where that's what they are supposed to learn how to basically report on science and how to make conclusions from their data set.

It doesn't end with just pictures. There are all kinds of different data formats. In chemistry, we are very big on spectra. That's how we prove that we made certain things. That's how we prove purity of things, and NMR is by far the most useful of those spectral techniques.

Here, we're making use of JSpecView and the JCAMP-DX format. Are any of you familiar with that, JCAMP? It's a very handy format. It's open. It supports any XY data, and you can convert a lot of the proprietary file formats from different instruments into JCAMP format.

Once you have it as a JCAMP format, you can put it up on the web in a way that a browser can interact with the data. You don't need for the person to download software to view it. It runs in Java. Basically, they click on a link. It pops up the spectrum. If you drag your mouse across, you can actually expand any peak that you want. Unfortunately here, because we're missing half a screen, you can't see that, but there's actually a very detailed peak. You cannot see the details in the original spectrum.
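Part of what makes the format handy is that it is plain text: a JCAMP-DX file is a series of labelled records of the form `##LABEL=value`, ending with `##END=`. The sketch below reads only those header records and ignores the data table; the sample text is invented, and a real viewer such as JSpecView does far more than this.

```python
def parse_jcamp_headers(text):
    """Collect the ##LABEL=value records of a JCAMP-DX file into a dict."""
    headers = {}
    for raw_line in text.splitlines():
        # "$$" starts a comment in JCAMP-DX; drop anything after it.
        line = raw_line.split("$$", 1)[0].strip()
        if not line.startswith("##") or "=" not in line:
            continue  # data rows and blank lines are skipped in this sketch
        label, value = line[2:].split("=", 1)
        label = label.strip().upper()
        if label == "END":
            break
        headers[label] = value.strip()
    return headers


# Invented example file content, for illustration only.
sample = """##TITLE=example spectrum
##JCAMP-DX=4.24
##DATA TYPE=NMR SPECTRUM
##XUNITS=HZ  $$ frequency axis
##YUNITS=RELATIVE ABUNDANCE
##END=
"""

print(parse_jcamp_headers(sample))
```

Because the records are readable like this, the same file can serve both a person skimming it and a program indexing it.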

This is a big deal because in a traditional publication you do supply supplementary materials, but it's generally in the format of a PDF. You cannot zoom into a PDF peak, and there's a lot of information there in terms of purities, in terms of all kinds of things that you would like to get at to figure out what happens.

Getting into indexing, again unfortunately, we are missing part of the screen here. I'll try to describe it as best I can. Over here, there is a list of compound names. So, we're looking at a list of solvents, like toluene, ethanol and vanillin that I'll be talking about. We can represent these different molecules using different things, like InChIs and InChIKeys and SMILES codes. Anybody here work with those? OK, one person.

Basically, these are the contemporary ways of representing molecules using linear text. If you want to represent toluene, you could type toluene, but there are many names for it. Methylbenzene is another name. How can you represent that in a way that a machine, for example, could read it unambiguously?

So, there are SMILES, InChIs and InChIKeys. I would certainly be happy to talk to anybody about the details of those, but the InChIKey is always the same length no matter how big the molecule is. That's nice for indexing in Google.

For example, if we click on this link, which is vanillin, you can see that it pops up my lab's work, and it also pops up some paper where that particular compound shows up. These are starting to be used more and more. They have huge advantages in terms of compressing information and making sure that it's absolutely corresponding to the molecule you want.
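The two points made here, that many names collapse to one identifier and that the key has a fixed length, can be shown in a few lines. This is a sketch under assumptions: it uses the current standard 27-character InChIKey layout (14 letters, 10 letters, 1 letter, separated by hyphens), which may differ from the keys in use at the time of the talk, and the small lookup table is a hypothetical stand-in for a real name-resolution web service.

```python
import re

# Standard InChIKey layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")

# Hypothetical synonym table; a real system would query a web service.
NAME_TO_INCHIKEY = {
    "toluene": "YXFVVABEGXRONW-UHFFFAOYSA-N",
    "methylbenzene": "YXFVVABEGXRONW-UHFFFAOYSA-N",
}


def looks_like_inchikey(text):
    """Check only the shape of the string, not whether the key is real."""
    return INCHIKEY_PATTERN.fullmatch(text) is not None


def key_for(name):
    """Map a compound name to its key, ignoring case and outer spaces."""
    return NAME_TO_INCHIKEY[name.strip().lower()]


print(key_for("Toluene") == key_for("methylbenzene"))
print(len(key_for("toluene")), looks_like_inchikey(key_for("toluene")))
```

Because the key is a short fixed-length token, it behaves like a single search term, which is what makes it convenient for a general-purpose search engine to index.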

Audience Member: What are the InChls?
Jean-Claude: The InChI, you actually can't figure them out. You'd have to look them up, but there are web services so you can use CAS. You can use different kinds of web services.
Audience Member: CAS is not representative of the structures.
Audience Member: But InChI is representative of the structure, and its connectivity is represented here.
Jean-Claude: The InChI, actually, the big reason why the InChIKey started to be used - for small molecules the InChIs are fine, but when you have molecules that are medium sized they are so big that Google can't index them anymore and that's a problem.
Audience Member: Now, we're taking [muffled voice].
Jean-Claude: These things are fairly recent. InChI is pretty recent. It's only in the past couple of years. InChIKeys is, maybe, one year.
Audience Member: [muffled voice]
Jean-Claude: The problem with CAS is the copyright. That's the big problem.
Audience Member: What about abstracts?
Jean-Claude: Chemical Abstracts? People do use them, but if you're talking to people that are interested in indexing a lot of stuff without having to worry about the legal aspects, they tend to stay away from CAS numbers. That doesn't mean you won't find them on Wikipedia. You'll find them in a bunch of places, but, yeah, you will definitely find them. But, the whole copyright issue is actually a big problem with that.
Audience Member: Since you're interrupting, let me say that as a reader I would much rather have trust than proof, that is, who do I trust? I'd like to trust, read like a proof. But, to have every reader have to go back and verify things from the beginning, it seems to be a very great burden.
Jean-Claude: The point is you only do it when you have to. The reason that I looked at that number is because it didn't make sense in the context of the other numbers. I didn't drill down to every other number.

Right now, you cannot do that from a publication. The peer review process does not cover that at all. So, that is the problem right now. My only option in a paper is that I have to email the author and hope that they respond by sending me the information that I want and that just doesn't happen very seamlessly.

Audience Member: Proof is needed by the reviewers, and that is proven by me in advance.
Jean-Claude: Yeah. So, basically, use trust for as long as you can get away with it, but when you are trying to repeat an experiment and you can't, you're kind of stuck with the whole trust issue.

I don't know if it's a post-doc who wrote that up. I don't know if it's a new student. Why not just give me the proof? I mean, it's really not that big of a deal. They already have all this information. It is just a question of making it open and having people access it.

Audience Member: Recent publications are devoted to doing this sort of thing, producing validated data over, let's say, a sequence of temperature, pressure, and this sort of thing. Is it that there are so many chemical compounds out there and so many different variables that even the publications that are devoted to that would work beyond this in the old days, I think. [crosstalk] just can't do it all.
Jean-Claude: I mean, there is so much to do that people haven't done them. Quite frankly people and companies probably have done these measurements and have no benefit in sharing them. So, there is a lot of that kind of thing as well.

But the level of detail that I am talking about, even if you look at it in this database, I don't think that you can access... Let's say that you challenged a number in this database. They're not going to send you the lab notebook pages where they got that information from. And that is what we are talking about. We are talking about transparency at the level of no insider information from the research group to the rest of the world.

So, there may be stuff that was added in the past hour, that is incorrect, by my students. And that's OK. We accept that as part of the process of working in the open. That is no different than any other place where we work in the open.

OK. So, one of the main databases that we do use is ChemSpider. In fact, the CEO is coming tomorrow, Tony Williams - two o'clock, Disque 109. It is a great opportunity. He will be talking about all this and he will be doing a demo if you want to interact with him. But, this is really a fantastic resource for manipulating organic chemicals.

What we do for the lab notebook... I don't want to be running software on my servers that will actually do substructure searching, that will be doing any kind of analysis like that. With ChemSpider I can actually farm that out. And this is free and hosted for you, for anyone to do.

You basically just link from the molecule to this and then it gives you all this information. It gives you the SMILES, the InChI, the InChIKey.

Audience Member: And the empirical formula, is what it gives.
Jean-Claude: OK. Empirical formula, yes. What we normally call it. But, that doesn't have enough information to...
Audience Member: [muffled voice]
Jean-Claude: Yes. Certainly that would be what we file under that.
Audience Member: Where is ChemSpider pulling this in from?
Jean-Claude: ChemSpider is pulling it from all over the place. They have links to the vendors. Down here you see experimental properties, melting point, and boiling point. They are actually links from ChemSpider to where they got them from. It could be an MSDS sheet. It could be any number of things.
Audience Member: Does this generate on the fly? Do they go out and get this stuff when you ask for it or has it harvested a lot?
Jean-Claude: ChemSpider has already harvested, like, over 21 million compounds.
Audience Member: So, you would call this a search engine for the invisible web, the chemical properties, which works like Google.
Jean-Claude: Yeah, that's a good way to look at it; a search engine for the chemical invisible web.
Audience Member: How does this compare to the Beilstein that was acquired from CrossFire?
Jean-Claude: Well, it's free first of all.
Audience Member: What's that?
Jean-Claude: And it's free.
Audience Member: And it's not in German!
Audience Member: How complete is it?
Jean-Claude: Yeah, Beilstein is certainly superior in terms of reactions, but this has actually started to compete with... If you're looking for properties like a boiling point, things like that, it is actually getting pretty comparable, I think, we used in the past year. This is all new stuff here. Going in the next five years, this is going to change. It is going to be totally unrecognizable. Right now, as of today, this is the state of the art.
Audience Member: [muffled voice]
Jean-Claude: We still pay for it. Drexel still has Beilstein, SciFinder, and all those pay services, and they're still useful. As long as they still provide information that these sites don't, we're still going to keep using them and teaching them.
Audience Member: So, you're not against this empirical overview of the literature associated with ChemSpider and Beilstein?
Jean-Claude: Each database has its own things they provide. ChemSpider, for our purposes, is extremely useful because it has these key things. Like if we want to generate the InChIKey, for example, this is by far the easiest way to do so. Go on ChemSpider, put your compound in, and then you have the InChIKey, you can copy and paste.

So, there is a whole bunch of things. You've got synonyms...

Audience Member: Is there sufficient meta data for a program, not a person, to go and find information, that it will point to some arbitrary substance and then take it and use it somewhere else?
Jean-Claude: Yeah, there are web services that you can hook up. Now, you can't get all the information because of licensing issues. Like some of these properties you can get them on one page, but you can't download 10,000 of them. So, there are those kinds of limitations. But, ChemSpider tries to be very good about providing everything they can in the form of a web service. So, that makes it very useful.
Audience Member: OK.
Jean-Claude: The other nice thing about ChemSpider is we can also upload the raw data, like the raw spectra, and they're also in JCAMP format. JSpecView looks at it. And as of, actually, yesterday we can deposit our solubility properties. So, Tony actually made a special parameter for us to put our solubility properties.

That is going to be very interesting as people... If you want to find the solubility in methanol you are going to be able to, over time.

OK. Let's see how much time I have here. This is until 1:30, right?

Audience Member: That's right.
Jean-Claude: OK.

So, this is something that I am very excited about, that also happened quite recently in the past few weeks. I told you about Open Notebook Science and I set up an Open Notebook Science Challenge, which is for people from around the world to actually do solubility measurements of certain kinds of organic compounds and report them to the central place. So, we actually have 127 measurements now. There are just a variety of solvents. There's different people that did similar techniques, but not exactly the same techniques.

We just recently got funding from Submeta for these Open Notebook Science Awards. And Drexel students are eligible for them. Actually, any student at a university in the States or in the UK is eligible for them. They are $500 apiece and there's ten of them over the next ten months.

This is kind of neat because we have judges that are chemists, that are either in academia or high up in industry, who will actually give feedback to students on their lab notebook. So, the student might put a report and a judge might come in and say, "You didn't provide enough information." or "This is wrong." or "Look at this." So, what we are trying to do is have a peer reviewed Open Notebook. That is something that hasn't been done and I think this is going to be very exciting.

It requires something of a challenge to get everyone motivated to actually do it. So, these are not big awards, but they're interesting enough that we have five students now contributing to this. It will be very interesting to see, in the next ten months, how this is going to play out.

This is what we have so far. Each one of these experiments can have 20 to 40 solubility measurements. That is not a big number, but, like I said, they cover about 127. And these all link to the lab notebooks.

You will see on the top here there is summary data. So, again, we don't expect people to click through every experiment to find what they're looking for. There should be easy ways of accessing that information. And again, Google Docs is a really good way to do that.

Here, I have validated solubilities. What does that mean? It means that I have decided that I don't see anything wrong with these data points at this time. Now that could change tomorrow, but I feel comfortable enough to put them up on here. And they link back - you see these links, experiment 208, 205 - these link back to the actual notebook pages. So, if you want to see how these numbers are generated you can.

Here, it just shows up as a nice look-up. You can see the solubility, and it's got the SMILES there which is another way, it's like an InChI - the solvent and the solute. And so this is a very large spreadsheet. And Google Docs can't handle very large volumes, but this will probably work for at least the first thousand entries that we have. That's roughly what we're looking at in the next few months.

Now, this is where it got really exciting in the past week, is that - I don't know if you guys have heard of Google Visualization API - anybody try to use that here? This is very cool stuff. So, Google Docs, again, it's free, it's hosted; and they actually have an API where you can query it and you can return results. So, if I search for vanillin, it gives me all the measurements of vanillin to date in all the different solvents.
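As a rough illustration of querying a hosted spreadsheet, the Python sketch below only builds a query URL and makes no network call. Everything specific in it is an assumption rather than something stated in the talk: the spreadsheet key is a placeholder, the column letters (A for solute, B for solvent, C for solubility) are hypothetical, and the endpoint shape is the one used by present-day Google Sheets, which may differ from what was available at the time.

```python
from urllib.parse import urlencode


def build_query_url(spreadsheet_key: str, solute: str) -> str:
    """Build a Visualization-API-style query URL for one solute.

    Assumed layout (hypothetical): column A = solute name,
    column B = solvent name, column C = measured solubility.
    The solute name is inserted as-is, so it must not contain quotes.
    """
    query = f"select A, B, C where A = '{solute}'"
    base = f"https://docs.google.com/spreadsheets/d/{spreadsheet_key}/gviz/tq"
    # tq carries the query text; tqx=out:csv asks for CSV output.
    return base + "?" + urlencode({"tq": query, "tqx": "out:csv"})


# Placeholder key, not a real spreadsheet.
url = build_query_url("SPREADSHEET_KEY", "vanillin")
print(url)
```

The design point is that the spreadsheet host does the filtering, so the notebook owner does not need to run any query software of their own.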

Unfortunately, we're missing the key part here, which is the names of the solvent because the screen isn't big enough! But, it turns out that some of these are actually farther apart than we might expect. On the right, you can see one of those examples. So, for methanol, we get values ranging from 2.8 to 4.2. So, this is the same experiment, run by different people, and we're getting values that are pretty far apart.

It actually tells us we have a mean of 3.2, and we have a standard deviation of 1.4. So, how do you interpret that? Well, if I were to only give you the mean and the standard deviation, you could probably use that for some purposes. But, maybe you want to look at that one that looks very big - which is actually one that I personally did. It looks like it's out of the rest of the group.

You might want to look at what's different about that experiment from the other ones. From the lab notebooks I'm starting to think that it has to do with how long they were vortexed - how long they were actually mixed. Because it may take actually a longer time than you think to reach a saturated solution. But, then again, we can only ask that question because we have access to the raw data.
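The point about summary statistics hiding an odd measurement can be sketched in a few lines of Python. The four values below are hypothetical, chosen only to span the 2.8 to 4.2 range mentioned for methanol; they are not the actual notebook data, and `summarize` is a name invented for this example.

```python
from statistics import mean, median, stdev


def summarize(values):
    """Return (mean, sample standard deviation, value farthest from the median)."""
    centre = median(values)
    # The median is less affected by a single stray value than the mean is.
    suspect = max(values, key=lambda v: abs(v - centre))
    return mean(values), stdev(values), suspect


# Hypothetical repeat measurements of the same solubility by different people.
methanol_runs = [2.8, 2.9, 3.0, 4.2]

avg, spread, suspect = summarize(methanol_runs)
print(f"mean={avg:.3f} sd={spread:.3f} farthest from median={suspect}")
```

With only the mean and standard deviation, a reader cannot tell whether the spread comes from all runs or from one; listing the individual values, each linked to its notebook page, is what lets someone ask what was different about that run.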

Audience Member: ... sciences, in the field of chemistry, a kind of passive knowledge of what actually happens at the lab bench. I am thinking about medical experiments [inaudible] biology - DNA extraction, these sorts of things. There is something that you couldn't learn even from the best value. You can get the rest of people's take to the shelf over the sink - you can share that.

But still, the only way to learn how to do these things or to do them correctly was to go to somebody in the lab and actually physically do the experiment. This is something that is still kind of missing, it seems to me, and could be even exacerbated in this reliance on 'he's done it' and you get a lot to think of that.

Jean-Claude: Well, this is stuff you have to record anyway. We're just making it public.
Audience Member: Right.
Jean-Claude: We're not really changing the workflow.
Audience Member: Right. But, there is a question of truth, and replicability. At some point, you also have the human.
Jean-Claude: No. That's the whole point. If you have a lab notebook and links to all the raw data, you don't need the human.
Audience Member: So, you know exactly what the pH of the water coming out of that faucet is?
Jean-Claude: If we measured it. Then again, there are some chemists who are better than others, and that's the point of this too is to show ways of recording science that's better. And if they didn't record it, you'd want to know that they didn't record it.
Audience Member: At what point are the things not rationally recordable. I'll give you an example from clinic indexing. A kind of customer, a former colleague of mine, Paddy Goodman; well, someone wanted a photograph of the Swedish ivy on the mantelpiece in the Oval Office. Now, photographs of the wall, these are indexed. What do you record in an image that is retrieval-worthy for some purpose? What do you record in a laboratory experiment that somebody is going to be concerned about down the road? Because you cannot.

I would argue, 'record everything.' So what is being recordable and what you wish - 'Damn, I wish I'd recorded that!' - later on.

Jean-Claude: And that's happened; and that's the conclusion of some experiments. 'We should have recorded this and guess what, next time we do.'
Audience Member: So, how does that information get shared out so the people know this?
Jean-Claude: That's what this is for.
Audience Member: Thanks. So, this is how you do it?
Jean-Claude: There is no other mechanism that exists right now to enable that sharing except if you happen to work closely with someone. Yeah, this is really the point of Open Notebook Science.
Audience Member: But isn't your point here, that assuming all four of those measures in your notebook, then yours would surely spend more of the time on mixing, and that would be different.

Or, yours would show you measured the mixing time, and theirs would show they didn't.

Jean-Claude: That's in fact the case, yeah. With some of those they did not measure the mixing time. So, now that brings the question...
Audience Member: But, at least it's a clue...
Jean-Claude: It's a clue.
Audience Member: ...and, of course, it may be a red herring because that may not be what affected it.
Jean-Claude: Exactly, but at least we could design that experiment. Yes. Yeah, there is not one way to do this measurement; you can evaporate, you can also use UV; there's a bunch of different tools you can use. And that's the whole point of this is you don't want people to have to go through each thing; you don't want to have to Google stuff to find it. You want to have this interface. Or you can access this now, and you can add your own intelligence to this. You don't need to know the chemistry to use this. But, you may find an anomaly to give to the chemist, and then the chemist can look into what's causing this anomaly.

But, yeah, all these numbers should be very close to each other and they're not, so there is something... Somebody is not doing it right.

So, down here, you see there is a link. This number - that first one - it's experiment 207, sample number three. You click on that, it takes you to the lab notebook, and you will see that, in fact, that is my experiment, which I did with my collaborator from Southampton.

Audience Member: I think you would have done...
Jean-Claude: That's the point. That everything - yeah. That you maintain the chain of where the data came from at all points in time.
Audience Member: It is a broad...
Jean-Claude: Yes.
Audience Member: ... what we did then, holding them.
Jean-Claude: I mean, it's very surprising in chemistry that the whole concept of provenance is really not widespread. You can open up a book of melting points, and it doesn't usually tell you where they got those numbers from; but it's from a trusted source. I don't know if that meets...
Audience Member: ...physics and references in the last part of the...
Jean-Claude: Well, they may reference the actual papers but I can tell you when you read those papers that there's often not a lot of information. But, at least here, you can see how much information was recorded.
Audience Member: Lot of detail with it.
Jean-Claude: Well, in fact, patents are probably the reason that chemists are so hyper about keeping a lot of notebooks. Because those are the legal documents that you use in a court case when you want to prove that you were first. And now we're trying to do the same thing, using them openly in real time, for another purpose that is not legal. Things are actually very hard to fake when you have to provide all the raw data. Things are easy to fake if all you have to do is put a number in a table. Oh, that was my yield!

It's very difficult to fake all the raw spectra, to fake all the weights, to fake all that stuff would be so difficult that you would get caught.

Audience Member: [muffled voice]
Jean-Claude: It's actually easier to just be truthful than to try to figure it out!
Audience Member: This is an excellent tool for researchers in, say, academia. Are there any thoughts about how this might integrate into the workflow of say, a pharmaceutical company or an optical...
Jean-Claude: Well, this is certainly available to the pharmaceutical companies. If they wanted to find out what solvent to do a reaction in, they could certainly find that information on our site. So, going forward we're going to have more and more values, and that's something that I hope will be copied as well, but, yeah absolutely. And there aren't any intellectual property issues, everyone comes on board collaborating, making things open, so that simplifies things a lot.
Audience Member: It means that the private companies will know this is a resource, but probably they are not going to participate.
Jean-Claude: That's what they're doing now.
Audience Member: It's different now. They might do a version of this internally for their own research, and would that become a standard tool that they can use live?
Jean-Claude: I mean, that's actually what they're doing with ChemSpider, because they don't want people to see what they're searching - they can get a copy of ChemSpider for a fee.

So, there are basically different business models emerging from this. I'm coming at it from the standpoint of we want good data, we want to report good data, and we want to publish the stuff. That's what we're doing, but, absolutely, it can get pretty tricky.

Anybody here working on RDF? Maybe we should talk later because I want to make sure that I do cover this thing with the robots. I have very little time.

We were talking about workflows before even if we didn't use that term: protocols, workflows. We have actually been converting what we wrote to be human readable into a very standard format that machines can read.

Basically, it says here common name methanol InChIKey. It gives you the InChIKey, and then it gives you the volume in milliliters. This can be scraped pretty easily and can be put into a database, for example. Is anybody working with people at MyExperiment? They're basically very heavy in the bioinformatics area.

What they call an experiment really means taking information, submitting it to BLAST search or something like that, and then getting information back. Some of these can actually be pretty tricky with lots of web services being called, so there's a place on MyExperiment where you can upload those.

In the past two weeks, they've actually opened up what they take as a workflow to include what I have on the right there, which are physical transformations or physical workflows, not just converting information. So, this is potentially very exciting. MyExperiment people are very heavily invested in this, so I think that there's definitely a future.
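To make the earlier point about machine-readable protocol steps concrete (common name, InChIKey, volume in milliliters), here is a small Python sketch. The exact layout of the notebook entries is not shown in the talk, so the semicolon-separated line format, the field labels, the 1.5 mL volume, and the function name `parse_step` are all assumptions made for illustration.

```python
def parse_step(line: str) -> dict:
    """Parse one 'label: value; label: value' protocol step into a record.

    The line format is hypothetical; only the three fields mentioned in
    the talk (common name, InChIKey, volume in mL) are expected.
    """
    fields = {}
    for part in line.split(";"):
        label, _, value = part.partition(":")
        fields[label.strip()] = value.strip()
    return {
        "common_name": fields["common name"],
        "inchikey": fields["InChIKey"],
        "volume_ml": float(fields["volume (mL)"]),
    }


# Hypothetical step: add 1.5 mL of methanol.
step = parse_step(
    "common name: methanol; InChIKey: OKKJLVBELUTLKV-UHFFFAOYSA-N; volume (mL): 1.5"
)
print(step)
```

Because the InChIKey travels with the name, a record like this can be joined against other databases without worrying about synonyms such as methanol versus methyl alcohol.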

Audience Member: Is there actually a workflow notation they do on the...?
Jean-Claude: There is a specific notation for their bioinformatics queries, but right now, I think they just decided, "Look, just open it up for people with physical processes." I guess, they're going to look later at standardizing those. No, right now, it's just a page.

Chemical Markup Language: anybody here involved a little bit? That's something else we haven't discussed.

There are always these different ways of making things machine-readable. I would like to talk a little bit about the use for it. The reason I got involved in solubilities is that we were trying to make compounds as anti-malarial agents. If we can predict the solubilities, we can actually figure out how to get good yields off of our target compounds. It just hadn't been done, so that's why we're doing it.

In parallel with recording the values, we are working with Rajarshi Guha at Indiana University who's actually doing modeling to predict the solubilities. So, in the coming months it will be very interesting to see what the predictions are versus what we've observed. Let me skip through here.

Rajarshi is doing docking. Is anybody doing docking? You basically have an enzyme and a small molecule, and you try to dock into it to try to inhibit it. Here we're also using Google Docs to report on the results of those docking runs, and again we're using Google Docs to do that as a very simple way of sharing information.

Just to tell you about the robots a little bit, this is the Ugi reaction that we're actually doing to make these anti-malarial compounds. We wanted to see if we could optimize this by using robots. So, Rajarshi lent us their MiniMapper system, which is basically just a syringe on an arm. It can go and pick up some liquid and deliver it at different positions.

We were able to do 48 reactions in parallel. These are little filter tubes about this big. The idea is that you have the robot add the four solutions, changing the parameter slightly, and then it precipitates. We filter and weigh it, and the weight is the yield of the reaction. The nice thing as was mentioned earlier about a robot is the robot actually spits out its own log of what it did.

That can be very useful from the standpoint of figuring out what it thinks it did, which is not necessarily what it did. This actually can be more problematic than having a human do it because you're doing a lot of reactions in parallel. If you have a systematic error, you sometimes don't know why you're having a problem. We had that problem.

The machine wasn't programmed carefully, and what happened was we didn't realize it wasn't changing the solvents completely. We were getting numbers definitely, but as a chemist I was looking at the numbers and saying, "Something is wrong." It took a while actually. We did probably over 1000 experiments before we knew what the problem was.

Eventually, we did do it. We recorded, so this is the calculation part of that. We wrote this up on a Wiki. Now, we're talking about a document that its intended audience, or its intended vehicle, is actually a standard journal. We wanted to write this to see, first of all, if a publisher would take it. Will he take a paper that was written in the open? The answer was yes. The publishers will definitely do it.

We submitted this, and the nice thing about this is that we can actually link to individual lab notebook pages from the journal article. Down here, it says the melting point was taken from experiment 99, and the proton NMR was taken from experiment 203. Right now, there's not a mechanism in chemistry that will actually do that because the lab notebooks are not made public or even available.

But here, because we do have those documents available, we can point to them. That means that basically that was a different batch than that one, and in a traditional article you don't make that distinction. You trust people that all of their stuff is the same quality.

If you're interested in automatic markup of documents, tomorrow Tony Williams will be talking about this in the ChemSpider talk. This is basically software that goes through a document and it figures out the chemical, and then you can see that it actually knows how to draw it. This kind of markup is actually very, very interesting going forward.

This paper, which is still under peer review, should appear in the Journal of Visualized Experiments where they actually send a team of people to do a camera recording of our experiments as well. So, there's the written, there's the video, and this will be the first example for us of having a peer-reviewed article linking back to the lab notebooks.

Another handy thing you can use these days is Nature Precedings. You may be familiar with arXiv, which handles a lot of the physics preprints. Nature Precedings handles a broader range. It's nice because it does have the editorial approval of Nature, a nice DOI, and a nice author list that you can use.

While my article is pending under peer-review at JoVE, I can actually put it up on Precedings. I can talk about it, I can give people a link, and they can download it. This is actually a very handy tool and very complementary to the peer-review process.

Audience Member: Publishers don't care....
Jean-Claude: No, many publishers don't care. What they don't want you to do is to take the final document with all of the editing that they did on a PDF copy and make that available. That's more the issue. They really don't have a problem with text. That's not the case for ACS, but it is the case for many others.

Some people are asking about Second Life. We don't have enough time, but we can do a lot of the same things looking at spectra and molecules. We have an eCrystals repository where people can submit the 3D structures of their molecules. That's another way we can make things open. We can report on the activities of the various compounds.

The final outcome in terms of the malaria project is that we actually have nine compounds which show activity against the enzyme, and we have four that show activity against infection of the malaria parasite into the red blood cells. This is neat because this has not been written up yet, but the information is available to anyone who actually wants to use it.

There are other people who are doing Open Notebook Science. Gus Rosania up in Michigan also works on drug transport, and all of his students are using the same Wiki approach. Cameron Neylon from the University of Southampton is also doing Open Notebook Science, but he's not using the Wiki. He's using a modified blog engine to do much of the same thing as we are.

I'd just like to thank my students. I think I've run out of time, right?



Wednesday, July 09, 2008

ITConversations Jean-Claude Bradley Interview



Announcer: Jean-Claude Bradley, an associate professor of chemistry at Drexel University, is a pioneering practitioner of open notebook science. On this edition of Interviews with Innovators, Bradley explains to host Jon Udell that he believes scientific research happens better and faster when the entire process is transparently narrated online. From IT Conversations.

[music]
Phil: Hi, and welcome to IT Conversations. I'm Phil Windley, the executive producer. Today, I'm happy to bring you another program from Jon Udell's Interviews with Innovators. This program is made possible by Microsoft's Channel 9 and Channel 10.
Announcer: By joining as a paid member, you'll not only help keep the Conversations Network on the air, you'll also have access to our Premium Edition programs without promotional messages. Just click on the "Join Now" button on our website to learn more. Our audio files are delivered by Limelight Networks, the high performance content delivery network for digital media.
Phil Windley: And now, here's Jon Udell.
Professor Jean-Claude Bradley: So, I started to do open notebook science in the summer of 2005. I've been a professor at Drexel since '96, and I've worked in several labs, doing gene therapy work, making DNA chip type of chemistry, and my background is a synthetic organic chemistry PhD.

So, I have experience in different labs. And one of the things that you notice if you work in a lab is that most of the stuff that you do never gets seen by anybody. As a chemist, you do an experiment, and you have a lab notebook, which is generally paper. I understand that in industry they're moving more towards electronic notebook systems. But, there, also, the control is pretty tight. So, effectively, it's pretty similar to the kind of exposure that you'd have with paper.

At least, within your own company, you could share things. But, in terms of other people that you don't know benefiting from what you've done, I always wondered if there would be a way to do that.

And it really took until about 2005, when social software technologies were really very, very easy to use. And there were plenty of examples of people using blogs and wikis and these kinds of things, and you could get free, fully hosted services that would have been inconceivable just a few years earlier. Because that's the other thing that was really important to me. I want to spend my time being a chemist. I don't want to spend my time managing a server.
Jon Udell: Yeah.
Professor Bradley: In chemistry, if you want to do a lot of the high end technical stuff, with respect to computers, you try to find a student that is comfortable, or even interested in, programming work as well as chemistry. And that's actually not that easy to find.
Jon: Yeah. That's always a bottleneck. So, I just want to ask. You mentioned that an idea here was for you to be able to share your work with other people so that they could benefit from it. But, I assume that there is an inverse thing happening here, too, which is that perhaps other people can help you with your work when they see what you're doing or what you're trying to do.
Professor Bradley: Yes. I mean, originally, I don't know what my expectations were in terms of what would happen. Theoretically, people could come in and start to comment on our experiments. And we have had that. We did get a few comments from people that just came out of the blue. But, mainly, people are not that willing to go out and do that. The main benefit that I've seen, from our end, is really to be able to find new collaborators that we would have otherwise not found.
Jon: Yes.
Professor Bradley: So, it's data finding data.
Jon: Yeah, exactly. So, explain how that actually happens for you, what the process looks like.
Professor Bradley: OK. So, the notebook that we're using, the actual lab notebook where each page is an experiment, that's on a wiki. I use Wikispaces. It's free. It has a Creative Commons by attribution license by default if you have the free account. And their service has been great. They have great functionality. No limitations on the free account. The pages look pretty similar to what they would in a book, except that we can do more stuff, like we can link to the raw data, actually.

So, if someone wants to dig to a particular spectrum, or they want to zoom in to something, or they want to verify a statement that the student or the PI made, they can do that. So, that's a requirement, that our lab notebook on Wikispaces is our official lab notebook. Students are free, certainly, to use paper for any purpose they wish, but the official lab notebook has to be on the wiki.
Jon: And who has been discovering and exploring these notebook pages and the data behind them?
Professor Bradley: Well, we get about 150 to 200 hits a day. And it's people that vary from the educational side of this, which we haven't really talked about, to people looking for the NMR of... like, this morning, someone was looking for the NMR of crotonic acid. If your listeners don't know what that is, it's basically just information about a compound. You wouldn't think to use Google as your first resource for that, but honestly people are using Google. I find myself using it ahead of using anything else, just because it's so quick.

And the fact that people are actually getting the information that they're looking for. I mean, we have taken the NMR of crotonic acid, and people can actually look at that and dig into any part of the spectrum that they want to.

So, I see a lot of that kind of stuff happening, where people are looking for very specific information. They might be looking for a boiling point of a solvent or something like that. And I know from their hits that, most of the time, I think, they're actually finding what they're looking for.
Jon: OK.
Professor Bradley: But, people are not very willing to contact you about that. So, I know what they were looking for. I know that they found it. But, I don't know if they went on and actually did an experiment from that, because that's something that I can't control. I can't actually contact the people because all I know is their location.
Jon: Yeah. But, you did mention that, through this process of exposing your work on the blog, you've come into contact with people who have become collaborators.
Professor Bradley: Yes. So, I didn't say "never." I just said that, most of the time, people don't end up contacting you. But, once in a while, you do, in fact, intersect with someone who is interested in open collaboration, and they're willing to participate. They have the skills to actually do something.

For example, we're making anti-malarial compounds. So, one of our collaborators is Rajarshi Guha, from Indiana University, and he does docking calculations. So, he basically uses just a computer to tell us which compounds we should make. And then, since we're the synthetic organic group, we can actually go out and make those compounds, and then have them tested somewhere else. For example, Phil Rosenthal's group at UCSF has tested our compounds for anti-malarial activity.

So, those are the kinds of collaborators that I'm very excited to be able to interact with, because they're doing exactly the part of the process where we're not the expert.
Jon: Yeah. So, in the normal course of events say, 15 years ago you would have run into people like this at conferences, or you would have read papers that they had written. In other words, it's not that there was no way for people to discover collaborators and to get together. Is it just a question of, it's one of those differences in degree becoming a difference in kind, kind of a thing?
Professor Bradley: Well, it has to do with speed, essentially...
Jon: Yeah.
Professor Bradley: Because it's very true. I mean, I can write a paper, and someone can find that paper. But, in order to actually get to the point of having a paper, you already have had to do a lot of work that actually works into a nice, little story. And a lab notebook isn't like that. A lab notebook is basically just stuff that you tried, and you try to get to a conclusion as quickly as you can, with sometimes limited information. And so that's a totally different kind of interface.

But, I should say. You were asking about the whole system. The lab notebook is definitely an integral part of the system. But, there's also a blog. There's also use of other social software, like FriendFeed and all these different, additional ways to connect up with the information.

But, regardless of how I talk about it, I'm always able to put a link. So, if I make a comment on my blog that's higher level, I don't want to repeat all the experimental details. I can just shoot a link over.
Jon: Yeah.
Professor Bradley: If someone has an issue, they can look at that. And I think, that's more where the utility is, if anyone wants to dig into any statement, they can. It doesn't mean that everybody's going to do that. It's only going to be the people who have a vested interest in a particular reaction, for example, that might bother.
Jon: Yeah. But, it's there if they choose to follow it.

So, there are, well, a variety of reasons why this isn't the dominant way things are done yet. And I guess we don't know if it will become the dominant ways things are done. But, talk about some of the push back, the reasons why, for publishing and tenure issues or for competitive issues, people would tend to not want to go this route. And how do you have that conversation with people?
Professor Bradley: Well, yeah, you're absolutely right. I mean, there are a lot of issues to consider, and they get brought up often at conferences. I really like to see the people's opinions on that. Although, at these conferences, to be fair, it's a self selected group, so a lot of the people there will be open already to openness.

But, you're absolutely right. In terms of doing traditional research, it depends on the field, but in organic chemistry, it's generally fairly secretive. People do talk about their early results at conferences sometimes. But, this whole openness like you have with the genome, or some specific datasets like that, is not part of the culture in chemistry.

You also have the issue of intellectual property. So, obviously, if I'm doing my work in real time, I'm showing everything that's going on. I don't have time to protect it.
Jon: Right.
Professor Bradley: Now, in the US, you could still get away with that, but it's just easier to just not deal with the intellectual property. There are plenty of areas in science where intellectual property doesn't even come up, like cosmology, and there are still people interested in doing that.

So, I think, for chemistry, what kind of shocks people sometimes is that they sort of assume that everything they're doing can be turned into a profitable patent. But, the reality is that, most of the time, that's not true, and it's also a lot of hassle to actually go through, and it's very expensive for the institution.

So, it's really not that different than sending a paper out. As soon as that paper goes out, and if you haven't submitted at least a preliminary patent, you're not going to be protected. So, that's an issue that you have to wrestle with.

It's not something that I'm trying to convince people to convert to. I'm just basically saying I have a hypothesis that making your research fully transparent and as close to real time as possible is going to actually produce a qualitatively different kind of science progress. And that's going to require the cooperation of some people, but not necessarily the majority of scientists.
Jon: That's a good point. That's a good point. So, this network effect could be very powerful, but involve relatively few nodes, in fact.
Professor Bradley: That would be fine.
Jon: Yeah.
Professor Bradley: Yeah. For me, I also want to move towards more automation, where machines are actually designing experiments, executing them and analyzing them, and then sharing their results. And in order to get to that kind of a system, for me it's obvious that everything has to be open, because, if not, the barrier that you have to go over to pay for services and try to get special permission to access people's data, it's just so large that it's just not going to happen very soon.

But, if all this information is open, anyone in the world who wants to, for example, look for correlations in our data, they can write a script, they can read it, and they can say, "Well, look, why don't you try this experiment?"
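As a rough illustration of the kind of script an outside reader might write against open notebook data, here is a minimal Python sketch that computes a Pearson correlation between two columns of a CSV export. The column names (`concentration_molar`, `yield_percent`), the experiment labels, and every number in `SAMPLE_CSV` are placeholders invented for this example; they are not taken from the actual notebook.

```python
import csv
import io
import math

# Placeholder rows standing in for an exported notebook summary.
# Column names, labels, and numbers are invented for illustration only.
SAMPLE_CSV = """experiment,concentration_molar,yield_percent
EXP-A,0.5,10
EXP-B,1.0,20
EXP-C,1.5,30
EXP-D,2.0,40
"""


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


def column_correlation(csv_text, col_x, col_y):
    """Parse CSV text and correlate two numeric columns by header name."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    xs = [float(row[col_x]) for row in rows]
    ys = [float(row[col_y]) for row in rows]
    return pearson(xs, ys)


if __name__ == "__main__":
    r = column_correlation(SAMPLE_CSV, "concentration_molar", "yield_percent")
    print(f"Pearson r = {r:.3f}")
```

A correlation like this only flags a relationship worth a follow-up experiment; it says nothing about why the relationship holds.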

And ultimately, we'd like that entire process to be fully automated. So, one of the things that we're actually looking at this week: we have Mettler Toledo, they're doing a trial with us. They brought in one of their automatic reactors. We're going to see if we can get people to suggest experiments for us to do during this trial of a few weeks.

So, ultimately, I have a destination beyond the human to human collaboration. My view on it is skewed with respect to that. And you don't need to get the majority of people contributing to have that effect.
Jon: You said, in passing, something kind of provocative: that the machine would be designing the experiments. What did you mean by that?
Professor Bradley: So, we're mixing these compounds together, and we actually don't know right now what the governing factors are on getting pure products as precipitates.
Jon: So, it's kind of a large possibility space. And if you could automate the sort of march through that space, then you'd like to do that.
Professor Bradley: Yeah. And that's not going to be us doing that. That's going to be people from the artificial intelligence community, or from the bioinformatics community, who may take an interest. And I can't predict that. All I can do is I can make it available, and I can make it as convenient as possible for people to get the information.

And so, I'm using things like Google Docs, Google Spreadsheets, as ways of quickly sharing information, the summaries of the lab notebook, and getting people to put reactions that they want to do in another Google spreadsheet. So, now there's the opportunity to make the Google spreadsheet completely public. I don't know if you're aware of that.

And so, that's actually pretty cool. So, that means that people don't have to register and log in and do all kinds of things.
Jon: Yeah.
Professor Bradley: They can just find it, dump their information. And ultimately, I'm in control. But, if I see that we have the ability to run an extra few reactions, why not? And that would be something that the whole world would benefit from.
Jon: [laughs] That's a really interesting take on this that I hadn't heard before.

So, you've talked a little bit about this kind of machine to machine possibility. In the human domain, I guess there's still an awful lot of what I tend to call tacit knowledge involved in understanding how you actually get a result from an experiment, which involves a lot of variables that you may not even perfectly understand yourself. And so I know that you've been very active in the use of screencasting as a way of kind of capturing... Well, you should talk about, actually, how you see that sort of show and tell piece of it fitting into the picture.
Professor Bradley: Yeah. In terms of screencasting (well, following your lead, of course), I think it's a fantastic technology. And I've used it pretty extensively for teaching. All of my classes in organic chemistry are recorded, and students can access them from anywhere. And it really does replace, in the sense of a classroom kind of interaction, or I should say a lecture style interaction with students, it really does replace it.

If you have a really good screencast, good audio, everything that you do on the screen is recorded, students can get exactly the same kind of information that they can in a live classroom. That's actually allowed me to stop doing lectures, and to assign the screencasts in the same way that I would assign a book chapter.
Jon: Ahead of class?
Professor Bradley: Yeah. I mean, there is no lecture anymore in my course. It's all workshops.
Jon: Yeah.
Professor Bradley: So, the students watch the screencast, do the problems, and then they come to the workshops and I can help them one on one.
Jon: Mm hmm. Mm hmm.
Professor Bradley: So, it's a fantastic technology, very simple. It reminds me very much of the social software, because of how simple it is and the leverage that it has.
Jon: So, in this case, you said that, having seen the screencast, which replaced the lecture, they then can come and receive really kind of individualized instruction. Is that actually happening kind of one on one? And if so, what's the continuing role for getting the class together as a group, physically?
Professor Bradley: I don't see a purpose of getting the entire class as a group together.
Jon: Really?
Professor Bradley: My objective is the students learn the skills that I outlined in my syllabus.
Jon: Huh.
Professor Bradley: I test them on that, and if they understand the material, they will do better on the test. So, my role as a teacher is I'm looking at the student learning. That's what I'm focused on. And if that requires me to spend three and a half hours with a group of a couple of students because that's how they learn best, that's fine. If other students can just go out (I get a lot of pre-med, and they're very busy), if they can do very well, and if they can understand the material from the screencast and from reading the book, that's great, too.
Jon: Yeah.
Professor Bradley: I know that some teachers do take it personally, this whole attendance thing. But, to me, leverage the technology. Keep remembering what it is that you're trying to do as a teacher, and if you can do it as effectively with technology, you don't necessarily have to be face to face.
Jon: So, I assume this is at least somewhat controversial. [laughs] You're in a business which, for a thousand and more years, has been kind of predicated on the lecture format to a group of people assembled in one place.
Professor Bradley: Yeah. Yeah. And they are still getting that, in the sense of the screencasts are simply recordings of those lectures. It's just that they can pause it. They can fast forward through it. It is still the lecture format.
Jon: They don't have to take notes. Yeah. Yeah. Oh yeah.
Professor Bradley: They are still used to someone's going to write something on the board and somebody's going to explain something, so I think, it's not that difficult for them to migrate from that.

And the big controversy about podcasting classes: will students still come in? It depends what the intention of the professor is. If you actually do take it personally that your students are not going to show up, and you do things like leave out certain things from the podcasts, or you only give audio, if you leave things out on purpose, then of course they're not going to do as well if they don't come in.

But, I'm not approaching it like that. I'm just approaching it: can I replace the lecture interface? And I do think that we can do that. It works pretty well, actually.
Jon: And then, so talk a little bit more about this other aspect, the sort of face to face, personalized instruction. How is that evolving? Because there's time constraints on that. So, how do you manage your time?
Professor Bradley: It depends on how many students come. Like they just had a test yesterday, so there was a pretty heavy review session today. If a few students have a similar problem, I might just go to them, have them work together. If there's only one student that has that particular problem, I have no difficulty with just working with them one on one.
Jon: Right.
Professor Bradley: I typically will not give them the answer. So, they'll show me what they're doing, I'll give them a hint, and then I'll move on to the next student or next group of students. And I'll rotate like that. And I think, that works pretty well. Sometimes it's a little bit strained, like the last 15 minutes of the last workshop for the final exam. Yeah, it's going to get a little... not everybody's going to get their questions answered, necessarily. But, in all the other workshops, really, there is plenty of time to do that and do it well.
Jon: There are all sorts of things that you can imagine once you eliminate that particular constraint of "we're all in the classroom together at the same time." And one of those is that, well, your class is one of a number of classes that are going on in various parts of the country, various parts of the world.

So, at the same time, there actually are other people who are teaching and other people who are learning the same subject. And then, if you relax the same time constraint, well, there's the class that did this the year before and the year before that, and also in these other locations. Do you know what I mean?
Professor Bradley: Yeah.
Jon: There's this notion that there's a kind of a wide area kind of collaboration that we can at least imagine would reshape the process of education in pretty interesting ways. Have you had a sense of how that could develop?
Professor Bradley: Yeah. I think, one of the best ways of doing that is through Second Life.
Jon: Really?
Professor Bradley: Because my students have extra credit assignments where they need to represent like a reaction that they learned in class onto Second Life. And we now have tools that allow us to build molecules pretty easily like full 3D molecules that are realistic looking using some tools that Andy Lang has actually developed with me. And so I've had my students do that. And they also can take quizzes in Second Life.
Jon: So, you're saying the molecular modeling is being done in Second Life, or it's being done in some tool which then imports the thing into Second Life?
Professor Bradley: Yeah. I mean, technically, it's done. What we need to figure out is the SMILES, or the InChI code, for your molecule.
Jon: The which code?
Professor Bradley: It's called SMILES.
Jon: OK.
Professor Bradley: It's a way of representing a molecule with a string of text.
Jon: Yeah, OK.
Professor Bradley: And that you can get from a number of places. We typically use like ChemSpider, or ChemSketch.
Jon: OK.
Professor Bradley: They're both free. And you can just draw the molecule and just export the SMILES. And then you basically go in Second Life. I give them this little machine. It's called a rezzer.
Jon: Yeah.
Professor Bradley: And they simply talk to it. They just dump the SMILES there, and the molecule, it actually hits a couple of servers to minimize the molecule to make it look realistic. And it uses that information to build it right in front of you. It's pretty neat, I have to admit. The first time the students actually see the molecule being built in 3D in front of them, it's great.

And the nice thing with these new tools is I don't have to spend any time whatsoever talking about scripting or anything like that. It's very, very simple.
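To make the idea of a molecule as a string of text concrete, here is a small Python sketch that counts the atoms written out in a SMILES string. It is a deliberately simplified tokenizer, not a full SMILES parser and not the rezzer's code: it handles the common organic-subset symbols, aromatic lowercase atoms, and bracket atoms only approximately, and it ignores implicit hydrogens. The function name `count_written_atoms` is made up for this example, and the string used for crotonic acid omits stereochemistry.

```python
import re
from collections import Counter

# Tokens, in order of preference: a bracket atom such as [NH4+],
# a two-letter organic-subset symbol, a one-letter organic-subset symbol,
# or an aromatic (lowercase) atom. Bonds, ring digits, and parentheses
# are simply skipped.
ATOM_PATTERN = re.compile(r"\[[^\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops]")

# Inside a bracket atom: optional isotope digits, then the element symbol.
BRACKET_SYMBOL = re.compile(r"\[\d*([A-Z][a-z]?|[a-z]{1,2})")


def count_written_atoms(smiles):
    """Count atoms that are explicitly written in a SMILES string.

    Simplified sketch: implicit hydrogens are not counted, and the
    string is not validated.
    """
    counts = Counter()
    for token in ATOM_PATTERN.findall(smiles):
        if token.startswith("["):
            match = BRACKET_SYMBOL.match(token)
            if match is None:
                continue  # skip bracket contents this sketch cannot read
            symbol = match.group(1)
        else:
            symbol = token
        # Aromatic atoms are lowercase in SMILES; normalise to element case.
        counts[symbol.capitalize()] += 1
    return counts


if __name__ == "__main__":
    # Crotonic acid (but-2-enoic acid), written without stereochemistry.
    print(count_written_atoms("CC=CC(=O)O"))
```

For real work a cheminformatics toolkit would be the better choice, since it also validates the string and fills in hydrogens.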
Jon: What's the Second Life connection here? In other words, you could refer a student to a web page that would launch something that would build the molecule and enable them to view it in a 3D fashion. But, there's more to this, I assume, right? There's this notion that, well, there's kind of a shared space that they're virtually present in. And so, what's the significance of that in this context?
Professor Bradley: Yeah. The power of Second Life, a lot of people don't realize. It's not so much the 3D world; it is the network. And so, what you do is these molecules are basically presentations. They just happen to be in 3D instead of in a poster format.

Although they do have to write a little something about what the molecule is, and if they're doing a reaction, they explain what it is, and other people can come and visit it. So, that's one place where there could be an interaction with other organic chemistry students from around the world, or with even teachers who teach chemistry.
Jon: So, this is actually occurring? And if so, can you kind of characterize what that interaction is like? Are people kind of standing around watching a molecule being built in Second Life and having a conversation about chemistry? I mean, is that what's actually happening?
Professor Bradley: Well, the building of the molecule may or may not happen in front of other people. It depends on who happens to be there.
Jon: OK.
Professor Bradley: But, these presentations are left there.
Jon: So, people can walk up to them after the fact and...
Professor Bradley: Yeah. On various islands. Like we have Drexel Island. We have Nature Island, called Second Nature. There's American Chemical Society Island, where I've participated in some of the construction there. So, there are these even chemistry specific resources, where you're pretty likely to find chemists there, especially on a place like American Chemical Society Island. So, definitely there's networking possibilities.
Jon: So, in that case, then, it's a question of, on the one hand, you have synchronous communication, like, "Is the person here now? Can I talk to them?" But, more broadly, there's asynchronous communication, which wouldn't really require Second Life at all. It could just happen on the web, right? So it may be, actually, a broader opportunity.
Professor Bradley: Yeah. I mean, if you think about Second Life as primarily a tool for connecting people together, there are a lot of asynchronous interactions. For example, there's a chemist who, nearby to my lab area on ACS Island, built a peptide. And she was looking at it and wondering if the chirality was right on the different atoms. I wasn't there when she had that issue. But, she IM'ed me. And what happens in Second Life is, if you're not there, it goes to your email.
Jon: Right.
Professor Bradley: So, I get this email. She has a question. So, I log in a couple of hours later. She's not there. But, I go to her peptide, which is in 3D, and I look at it, and I confirm that, yeah, in fact this is the wrong chirality for this amino acid. And so I leave her a message.
Jon: Yeah.
Professor Bradley: And then, maybe next time, we're both there at the same time and we can talk about it synchronously. But, you can absolutely interact with people asynchronously as well.
Jon: Yeah. Although, I guess what I'm getting at here is the fact that this is occurring in Second Life is kind of tangential, in a way, right? Like it all could happen in other ways on the web.
Professor Bradley: The thing is that Second Life is very sticky, so you are more likely to find people doing stuff. Whereas, if you just went to a chat room, for example, I don't think that would work very well in chemistry, because it's difficult to show what's going on. There's not a sense of identity. The avatar effect is actually very powerful, in terms of people identifying with their avatar. And most students, after a couple of times in, want to know how to change their avatar to look like something else.
Jon: Mm hmm.
Professor Bradley: And that's the kind of involvement that you will not find in a lot of these other social software kind of interactions. So, yeah, it is actually different in that sense, I have to admit.
Jon: This is a little bit of a tangent, this whole thing, but since you brought it up. I mean, my sense right now is that, as a mainstream kind of phenomenon, Second Life is gated by the pretty intense amount of UI machinery and sort of conventions that need to be mastered in order to just navigate and be in the space in an effective way. Do you know what I mean? It's like there's a fair amount of overhead there for people to get over.

And I think, we're like a generation or two away from this being a naturalistic enough kind of an interface, that people will spend way less time figuring out how to drive their avatars around and can focus more on what's actually happening in the environment. Do you think that's a reasonable observation?
Professor Bradley: Yeah. That is a very important factor if you consider using Second Life. And that's where the workshops are extremely handy, because the students bring their laptops, and it takes like five to 10 minutes to get a student who is totally naive to being able to do everything they need to do: walk around, fly around, click on a quiz, build a molecule. And that happens very easily if we're in the same place at the same time.

But yes, if I just tell you, "Go onto Second Life," most people will get pretty frustrated because you end up on these orientation islands, where the purpose is more general. They're trying to show you how to do all kinds of different things. But, I don't need my students to learn all kinds of different things. I just need them to fly, walk, teleport, maybe talk, chat with who they want, and add a friend.
Jon: Yeah.
Professor Bradley: Then, if they want to change their appearance, for example, that's great. But, you shouldn't make that a necessary thing in order to participate in the process.

And again, I can't overemphasize how important it is. That face to face time with the student's laptop can make that process almost painless.
Jon: Or, I guess, actually capturing in a screencast some of the instruction that you do in that face to face environment. So, that also could be...
Professor Bradley: Yeah, absolutely. I haven't done that. I've thought of doing that. I mean, students also have issues with their computer. I can tell if their video card isn't good enough to run Second Life...
Jon: Yeah. Yeah.
Professor Bradley: Yes, the screencast would help a little bit, but it's really nice to have that face to face.

So, yeah. I mean, it's just a factor. And once you take that into account, then it becomes pretty usable. Now, I don't make Second Life mandatory in my class. That's a whole different question.
Jon: Right. Right.
Professor Bradley: If I have 200 students, it's not feasible for me to do that.
Jon: Right. Right. Right.
Professor Bradley: Yeah.
Jon: So, [laughs] I didn't realize you were so deeply into that. That's interesting to hear.

I'm a little bit skeptical about it, just because I'm trying to focus, in my own work and in my own stuff, on things that are, I would say, more accessible to everybody, and where the barriers to entry are lower.

So, here's the thing. Even when you strip away all of the technological barriers (so, if everybody had the right gear and the right amount of bandwidth and things like that), there are still sort of conceptual hurdles that people have to get over, right? I mean, you crossed a huge conceptual hurdle when you got into Open Notebook Science, right? There's no high tech really going on there.

I mean, yeah, you mentioned that, well, it helps to have these services kind of floating on the net that we can just use for this purpose. But, it's more sort of getting the idea, right? That intuition that you have, that by operating in this transparent way, the science will progress faster and better than it otherwise could, right?
Professor Bradley: That's the hypothesis, yeah.
Jon: Yeah. And that's just a question of getting the idea in your head and then having a sense of how to use some pretty basic, very readily available technologies. So, blogs and wikis had been around for quite some time before this idea started to take hold. You know what I mean? In other words, I'm trying to say it's more of a conceptual breakthrough that isn't really gated to any particular technology at all.
Professor Bradley: Right. And that comes up when we talk about other sciences that may not have the traditional paper notebook. I don't know. I mean, I'm just saying, focus on making your science as transparent as possible. I like to say, try to have no insider information. In other words, someone observing what you're doing would be able to come to the same conclusions as your internal group.

So, if that gets done with a wiki, that's fine. Everybody's using a slightly different system. And that's what's very interesting to see, that we're learning about not only their science, but also about how they think about science, as they construct their notebook.
Jon: So, tell me this, then, since we would all agree there is an appropriate role for competition and there are also, obviously, lots of ways where cooperation is appropriate, we are always kind of struggling to find that balance. It is not an 'either or' kind of thing. We need to have the right ingredients in the right proportions.

And so, from that perspective how do you see weaving in the appropriate aspects of the competitive nature of science into this environment where, in fact, there is also this transparent work happening? How do you reconcile those things?
Professor Bradley: Well, you are making a supposition that competition is necessary for science to progress, and I don't think that is correct.
Jon: OK. Well, good.
Professor Bradley: Because I think, competition... if you have a company and you have a business model that is based upon intellectual property, then absolutely. Then, you have to build the way that you work in such a way that you are going to be able to protect these compounds and then sell them or whatever you want to do.

In terms of actually getting science done, in the sense that today we are able to do something that we weren't able to do yesterday because we understood something new. Maybe, we didn't understand anything. Maybe, it's purely empirical. Like we just know how to make these compounds precipitate. It will take a year to figure out why exactly, but if we are able to control reality in some new way, then I think, that can be done without competition.
Jon: Setting aside the intellectual property aspect of it, there is just the notion that, well, being the first one to discover something is really different from being someone who comes along afterward, or even someone who made the discovery just slightly after the first person did.

There is a kind of, I guess you could call it, game behavior, which is, I think, kind of fundamental to human nature. Where I was sort of going with this, though, is that one of the, I think, still unfulfilled uses of technology (in particular the stuff that looks like or can be made to look like games) is that there are ways to leverage that sort of game behavior and that competitive instinct which can be useful.

For example, a lot of social software is a way for people who are active and who make contributions to take credit for the contributions that they've made, and there can be positive aspects to the kind of competition that is engendered by that architecture.
Professor Bradley: Yeah, I think, in that sense competition can be useful. Like recently I submitted a proposal to the Gates Foundation. Part of the idea is to give prizes to the first people who predict that a compound would be active and make the compound and then test the compound and it turns out to have a minimal activity. Giving out a prize for the first three people who actually do that, I think, there is an element of competition there, but it is not where people are going to work secretly.
Jon: Exactly. This is really just a kind of this line of thinking here. A while ago I interviewed a guy, Ned Gulley, who works at MatWorks, and he designed a contest for Matlab programmers which have turned into a really interesting thing because it is a contest, but it is also a contest that also happens in framework where ultimately all of the code that you write is transparent.

If you are working toward a solution, you can clone other people's code. You can tweak it. You can completely change it, and so there is this fascinating kind of interplay. You know, there are incentives for sharing, and there are also incentives for not sharing. And people are kind of walking the line between those two sets of incentives as they play this game.

It is just a real interesting kind of social experiment in terms of how we leverage the competitive instinct in ways that can be most useful and productive in an environment where ultimately it is all transparent.
Professor Bradley: Yeah, that is a great example. It is very analogous, absolutely. You know, in terms of just talking about the simplicity of the software...
Jon: Yeah.
Professor Bradley: One of my biggest surprises when I was doing this initially, naively I thought that you just use a blog to record the experiments.
Jon: Yeah.
Professor Bradley: It seemed to me to make sense because you have one post. You can describe one experiment, and then have people comment on it. And then, create a new experiment. That actually didn't work very well, and the reason is there is so much editing that goes on in terms of recording the science. And the wiki works better because you are able to access any individual version. You are able to see when conclusions were made, when errors were found and corrected. You are able to see who made each particular contribution.

So, in terms of the software, people do want attribution and using something like a wiki breaks down each person's contribution to such a fine level that, I think, it makes sense for people to participate because you will get credit for exactly what you did. And that is a very special kind of technology that allows you to actually do that. If we did not have wikis I don't think that we would still be doing it at this point.
Jon: Or more broadly, version control.
Professor Bradley: Right, easy version control.
Jon: Yeah. No, I think, that's right, actually. This is an area where relatively few people have yet experienced that effect. Programmers take it for granted. Programmers have been working in source control systems for many years and have at this point an intuitive sense of what it's like to be in an environment where, like you said, each contribution is individually tracked and can be rolled back and so on. Those people who have gotten deeply involved in working in the wiki environment also have developed a sense of that.

But, I think, that is still probably a new experience to most people: the fact that things can work this way, that these things can be tracked in this granular way.
Professor Bradley: Yeah, and not everyone is necessarily comfortable working with the wiki right away.
Jon: Right.
Professor Bradley: It is a minor issue that can be addressed, but that's part of the reason that we use multiple technologies. We also use a mailing list, for example. It sounds very 'old school,' but it turns out that it has a very good role to play for certain kinds of interactions, especially debugging stuff. It has to work pretty well.

You know, it's not like one of the technologies took over. It is all these pieces that actually fit very nicely together. They are all free, and anyone who wants to replicate any part of the system can do so overnight, basically.

That's another reason why I worked with the specific tools that I did because it is always frustrating if you see something that you like that you want to do, and oh, you have to buy software. That's a big limitation for people getting involved.

That's another very special thing that has happened in the past years that wasn't there several years ago.
Jon: I think I read that you are also sort of the e-learning coordinator for Drexel at large. Is that true?
Professor Bradley: For the College of Arts and Sciences at Drexel, I am the e-learning coordinator.
Jon: Maybe, to kind of close out, because we have talked a lot about science and chemistry in particular, you can say a little about what your view is of how these things are playing out in other disciplines at the school.
Professor Bradley: Well, one of the projects that I was involved with pretty heavily is setting up Drexel Island on Second Life. We have about 30 groups on there. And that's been an interesting process, to see what people find to be helpful, what kind of use they make of the technology. A lot of it is basically just advertising their department or class, whatever. And everybody has a different take on it.

And I think, that's been my main focus, in addition to helping people with screencasting. But, in the past year, that's probably been the most intense thing.

You've mentioned that you have some reservations about it, and I absolutely understand that. I was actually not that caught on it for a long time, because it felt like it was using technology just for its own sake. But, I had a friend, Beth Ritter-Guth, who was teaching English, and she invited me to her class in Second Life. And once I saw how the avatars interacted together and how it was really different than anything else, I got interested.
Jon: How would you characterize what you saw happening there?
Professor Bradley: It's very personal. In other words, you turn and you can see an avatar. And you have to remember, when you're looking at an avatar, especially for someone who's been around there for a while, that's them. That's who they want to be. Because you can be whatever you want to be. So, you're actually looking at them, in a sense, even more deeply than you would in person, because you can't change everything about yourself in person.

So, there's that very personal connection that you absolutely do not get from a chat room. Because we're still using chat. Or you can use voice. In fact, voice has been a recent addition. And it's kind of interesting: we still don't tend to use it, even though sometimes three or four people are gathered and everybody has voice. You can tell because there's a little white dot that shows up on the avatar's head. But, people still like to use chat.

So, there is something interesting about that, in terms of this kind of barrier. I think, it has to do with some kind of privacy. People still feel a little bit more private when they're using chat.
Jon: We still need to kind of get at why it is that... So, earlier, you said, well, there really isn't an important reason anymore for people to gather together in the physical space of the classroom. At the same time, we're saying, well, there may be a reason why it's interesting and important for people to gather in these virtual spaces. So, why?
Professor Bradley: Well, to meet people. I mean, I think, from my perspective, I introduce my students to organic chemistry. And that has to do with content. It has to do with principles. But, it also has to do with getting involved in the world of chemistry in general.
Jon: OK.
Professor Bradley: And right now, that's the big opportunity with Second Life, is that doing very minimal things, I can bring a couple of students in who may even be tech-averse. And then I can see, once they realize that they can meet new people, that becomes interesting. They can meet their classmates, of course, who they might not have seen because they were taking everything online.
Jon: [laughs] You have to admit, there's a certain kind of circularity to this, right? It's like, [laughs] "We could have just gone to the physical place and met. But, we're not doing that, so now we have to go to this virtual place to meet. But, if we're all students at the same university on the same campus, we actually could have met in the same physical place."
Professor Bradley: I think, it has to do with the term "have to." This is just an additional channel that they can use. So, it's not something that they must do. If I hold a class and I have an attendance policy, I'm saying, "You guys must come here." OK?
Jon: Right.
Professor Bradley: And that locks me up from doing other things.
Jon: Yeah. Yeah. Really, what we're trying to get to and you started to go there with this example from the English class. What she was saying, what this teacher was saying is that there are aspects to the quality of interaction that can occur in that environment which are qualitatively different and, in some ways, for some purposes, better. Right?
Professor Bradley: Yes. It all has to do with this long tail, I guess. Most students don't participate in it, and that's absolutely fine. And it benefits a certain subpopulation of the class, just as having a certain kind of workshop with students benefits them in a way that it doesn't benefit other students.

Using the screencasting, that buys me time. And that's the important part of the screencasting. So, by doing that, I'm able to do the Second Life. I'm able to use molecular models, for example, physical molecular models that I can pull out, and we can look at things that are difficult to see on paper.
Jon: So, in terms of, when you said "long tail," it sounds like what you really were getting at there was that in a class full of kids, as we know, there are going to be a few who are outgoing and social and will perform well and communicate effectively and actually end up taking up a lot of the airtime in that setting. And this is, again, generally true for many modes of electronic communication, not just Second Life.

But, in these other modes, it becomes possible for people who have sort of different comfort levels, different amounts of extroversion versus introversion, and different styles of communication, to flourish where they might not have flourished so well in the face-to-face environment.
Professor Bradley: Yeah, exactly. And again, the student is in control of this. I think that they're getting their best value with the model that offers different possibilities.

They take a test after a couple weeks, all right? Now, if they get a poor result on that test, at that point we need to figure out, "What is it that you need to be doing? Maybe you thought you could do this class completely online, but it turns out that, no, you've got to come in for one workshop a week." And I can simply advise them of that. From my experience, I know that's exactly what they need. And all I can do is basically just tell them, "These are the options you have in my class. And if I think that you can benefit from Second Life, I'll help you with doing that."
Jon: That's a good way to put it. Yeah.
Professor Bradley: Yeah. And that's all it is, really.

If you were asking me, "What was the benefit of Second Life?" I was sort of thinking that the most beneficial thing that could possibly happen is that they would meet someone and, later on, they would end up doing a co-op at a company because they met somebody from there, or they would do a PhD somewhere. And those are the kinds of connections that are kind of difficult to make for students who are undergrads, like sophomores. And I think that that can have a tremendous benefit.

But no, I couldn't expect 200 students to go through that process. I mean, most of the students are busy enough just worrying about getting the basic material down.
Jon: Yeah. Yeah.
Professor Bradley: But, it's an interesting possibility and something that I had not really contemplated before I got involved with it, as to why it is a powerful tool. For me, it comes down to the networking, either in teaching or research.
Jon: Well, we're not going to solve it on this call, but it is still to be explained what's the sort of qualitative difference between this style of interpersonal communication and social networking that happens, or can happen, in Second Life, versus the style that is happening in a variety of other environments online, right?
Professor Bradley: Well, a really good example is a chat room. I'm sure you have a lot of experience with going to a chat room. There's 10 people already having conversations and you don't really know what's going on, and you're trying to figure it out. Well, in Second Life, because everyone has an avatar, you can approach a group of people, and you see who's talking to each other.

You can face a certain person, or you can even IM one of the people from the group. It sounds strange, but there's a huge effect. It really feels like you're actually there. To the left of you is Juan, and to the right of you is Mary. And, that's a completely different feeling than being in a chat room where you can't see anything.
Jon: No, it's true. Setting aside all the 3D modeling, it does seem to come down to the notion of what you just said. Here I am and to my left is this person and to my right is that person. There is something kind of deep and fundamental about that, that is a little mysterious, I think, but obviously, powerful.
Professor Bradley: Yeah, and if the person next to you is dressed up as a Goth, that tells you a lot about them. You can have a conversation with them that would be totally nonexistent in the chat room, and you can actually be quiet. So, you can approach a group and everyone will know that you're there, so if they want, they can turn to you and talk to you. And again, that's something that you can't replicate with a traditional chat room.

So, there are a lot of good things you can do. I guess I didn't mention that I've run a couple of conferences using poster presentations in Second Life, and that works really well in terms of having the avatar stand in front of the poster and people come and ask questions.

You can click on the poster to see different slides. And that's something, from a technological standpoint, that's very nice because you can take a PowerPoint, export the images, and upload them in Second Life pretty quickly. So, that's something that I think, is one of the most useful things that you can do with this technology.
Jon: Yeah, that makes a lot of sense because at that level the telepresence is a really effective replacement for taking the plane trip to be at the place. Let's put it this way, on the one hand, it replaces the part where you're standing around looking at the poster and discussing it with the presenter, and that's actually a reasonable substitution. And then, to what extent it substitutes for the conversation that you had in the bar afterwards, your mileage will vary in a whole lot of ways, depending on who you are and who the other person is, and what the environment is.
Professor Bradley: And, what their experience is, because if they have a slow video card, you have to realize that they're just spending all their time being frustrated. They're turning right and there are delays. So, those are some of the things that we talk about with new people, and if they're unhappy, we just try to figure out what's going on here. And, if their computer doesn't have a good video card, there's no point in continuing because it's just too frustrating.
Jon: It's good to hear your thoughts on this because I always like to revisit my prejudices and my biases.
Professor Bradley: I know that feeling.
Jon: You're inviting me to reconsider how I have been thinking about the uses of Second Life.
Professor Bradley: Yeah, let me know if you want a little tour, I'd be happy to give you one. I could show you Drexel Island, certainly, and Nature Island has a lot of good things in terms of science.
Jon: You know, it might be fun to do that and make the movie of it.
Professor Bradley: Yeah.
Jon: Actually, for me, one of the coolest things about Second Life was the camera control, and that there really is this camera there that you can independently move around, point at anything, zoom in or out, and pan. The cinematographic possibilities of Second Life are pretty amazing, and I've done a little bit of that, and it would be fun actually to do this.
Professor Bradley: Where did you go?
Jon: Where did I go? Well, actually I was invited to an IBM press conference on Second Life. This is actually what gave me the sense that I still have, which is that a lot of this reminds me of the early days of the web when people would say, breathlessly, come visit my page on the World Wide Web and you're like when?

Corporations were hiring design firms to make their all important home page on the web. And so, when I started, like 10 years later, to hear about companies hiring design firms to build out their Second Life islands, I thought this is just a replay of that and I think, honestly, in some ways, it is.

Actually the movie that I made is probably the funniest thing I've ever done. It's me at this IBM press conference where basically I get there and some guy's showing a PowerPoint that I'm not interested in, and there's a bunch of people I don't know, so I wind up wandering around looking at the fish tank, and eventually I get bored and fly around.
Professor Bradley: You have to have a good reason for going in. Have someone give you a tour. I think, you'll find it much more satisfying.
Jon: Let's do that.
Professor Bradley: Now, what software are you using for recording these days?
Jon: Well, you can do it right in Second Life.
Professor Bradley: The Screencast?
Jon: Yeah, you can do a screen cap right inside of Second Life. Did you know this? It's one of the coolest things.
Professor Bradley: It's built in?
Jon: Yeah, it's built in.
Professor Bradley: I didn't know that.
Jon: You just turn on the recorder.
Professor Bradley: I'll have to check that out.
Jon: Yeah, that part is great.
Professor Bradley: I was using Camtasia.
Jon: No, you don't need to. You can save directly, or you used to be able to. I haven't looked at this in a couple of years.
Professor Bradley: I wonder why I never came across that. That's weird.
Jon: Yeah, you hit the record button.
Professor Bradley: I will definitely check that out. It just records with the camera?
Jon: It records the view.
Professor Bradley: Does it show your little avatar in there?
Jon: It does.
Professor Bradley: Behind your head?
Jon: Yeah.
Professor Bradley: OK, cool.
Jon: That would be kind of cool actually. There are precious few, actually I would have to say no, examples that I know of where somebody can just go, without the overhead of getting into the environment and learning how to fly the avatar around, and see a nice crystal clear example of how this stuff is being used in an educational context, and can take away a sense of what that mode is doing and what the value of it is.

Well, it's been a pleasure to speak with you. I've been hearing about what you do from all sorts of different people who keep on saying that I should talk to Jean-Claude.
Professor Bradley: Well, thank you very much. Actually you're one of the first names that I saw. This is when you did "Umlaut".
Jon: Oh, the Wikipedia movie.
Professor Bradley: That was pretty cool. And then, look at what's happened after a few years.
Jon: Which, of course, was not new; almost nothing is really new. But, the idea that these representations of what happens on computer screens can be canned and replayed has become how all kinds of software are explained to people now.
Professor Bradley: Yeah, it's a no-brainer once you know how to do it.
Jon: Yeah.
Professor Bradley: OK, thanks.
Jon: Thanks a lot.
Professor Bradley: Bye.

[music]
Announcer: You've been listening to Interviews with Innovators with host, Jon Udell. By joining as a paid member, you'll not only help keep the Conversations Network on the air, you'll also have access to our premium edition programs without promotional messages. Just click on the 'Join Now' button on our website to learn more.

Our audio files are delivered by Limelight Networks, the high-performance content delivery network for digital media. The post-production audio engineer for this program was Jon Udell. Our website editor was Niels Makel. The series producer is Niels Makel.
Phil: This is Phil Windley. I hope you'll join me next time for another great edition of Interviews with Innovators on IT Conversations.