Wednesday, July 31, 2013

Local Learning Communities, Twitter, and Tobler's Law

Derek Bruff recently posted a summary of this week's discussion of Vanderbilt's local learning group focused on Anthony Robinson's COURSERA geography MOOC (https://www.coursera.org/course/maps). Derek's summary is posted here: https://my.vanderbilt.edu/vandymaps/2013/07/second-week-reflections-social-learning-in-a-mooc/

I wrote a response in Comments that is awaiting moderation, but it's a blog post in itself and I have few spare cycles, and so that comment will do double duty!

---

I have not participated in the local meetings these past two weeks, but I HAVE participated in the local group, thanks to #vandymaps (and thus, twitter), and into the future, maybe more blogging. Apropos your post, Derek, and this week's discussion, I've been reflecting on the social experience as well. Why have I spent 8 hours each of the past two Sundays on this class, when I am otherwise SLAMMED? This qualifier ("slammed") shouldn't be underestimated in what follows.

I am attempting the MOOC (#mapmooc), in part, because of an interest in geography and I like maps! But that's insufficient to explain 8 hour Sundays, I think -- in fact, I'm sure. 

In part, the attempt is also motivated by the idea that a Director for a digital learning institute should actually take and finish a MOOC, rather than only auditing MOOCs, though I don't think that explains the 8 hour Sundays either -- I could always complete "the next MOOC" rather than this one.

It strikes me that I mostly owe my stick-to-itiveness in this case to #vandymaps. Initially, #vandymaps was a local learning community (I did Skype in from Washington state for the pre-MOOC-kickoff meeting with all my local colleagues, huddled around a table -- very nice). A few of the #vandymaps people I would have counted as friends at the time of our pre-MOOC meeting (and still do, btw!! :-), but most of you I didn't know well, or not at all. Nonetheless, #vandymaps is still grounded in a local learning community (Todd Hughes's big, warm welcome to the group is as important a qualifier as "being slammed" though with the opposite sentiment). But for me, this local community is one that (a) I am currently operating in virtually through twitter, and that (b) is being expanded through twitter to include others.

It's no accident that I used the #vandymaps hashtag to label the local group!

For me, I had enough social ties with people in #vandymaps initially that even virtual interactions were affective and therefore effective, so I see the "persistence" characteristic as directly causal in my case, but the locality characteristic (in my case) is causal of that. (This group is not shy about computing metaphors, so think Bayesian networks!!! :-)

Another exciting aspect of this MOOC experience is learning twitter. Apropos this, and the influence of "local" and "community", are my nascent but growing interactions on twitter. Here is a tweet that links some themes in these comments:

"28 Jul: Rocketing towards distinction in #mapmooc :-) The local group, #vandymaps, is such a great help -- a variant on Tobler's law!!!"

This has been re-tweeted and favorited by others -- VERY neat, but I only just learned of this (because I am only just learning twitter!). So, this doesn't explain my two previous 8 hour Sundays either, but these and other acknowledgements on Twitter (by non-local, as well as local persons) might be an influence on my future in this course.

But this connection between Tobler's Law and local learning communities is really interesting. Recall that Tobler's Law says that "Everything is related to everything else, but near things are more related than distant things." The Law doesn't ascribe a causal direction. Even though I am operating virtually with #vandymaps, they are "more relevant" to me than the world (at least so far as my #mapmooc behavior to date is concerned) -- it's interesting, and will receive much more thought. It's also bringing me back to tried and true sentiments of "think global and act local" and "soldiers fight for their countries but die for their friends", and thinking through more variants on themes of locality, friendship, collegiality, obligation, reputation, etc. More later, I hope!

I see the importance of locality and Tobler's Law (directly or indirectly causal) to other activities too, most recently the 5 hours I spent on a big multi-institution proposal this weekend -- why? Because it was being headed by Vanderbilt, I'm part of this community (though not a co-PI) and "we" were all in this together, though all working virtually and asynchronously (but not through twitter :-) There are studies on locality and collaboration btw -- will dig those up for a scholarly article.

Thanks, Derek, for the post!

Sunday, July 21, 2013

A Principle of Mapping Involving Precision and Accuracy

In the Coursera Map course that I am taking (https://www.coursera.org/course/maps), we were asked to place virtual pins on an interactive world map. Many students drilled down deeply to very specific locations in order to place their pins, some of these deliberately lying a bit so as not to reveal their exact location. Others placed their pins at coarse renderings on the map, without drilling down deeply at all, BUT NONETHELESS their pins appeared in rather arbitrary locations when the user drilled down and the map was viewed at a higher resolution. I wrote the following to the course discussion board, with a few revisions here to contextualize it. These maps represent digital tools that we may want to use in Vanderbilt MOOCs and we may want to improve on them.

----

I was one of those who drilled down to the building level in placing my pin, and was quite exact (an exact location in the city of Nashville). I saw one posted pin that was in the middle of Vanderbilt stadium and other pins looked haphazardly placed. Some of these may have been placed intentionally in (modestly) wrong places, as many have outlined on the boards. But some of these may have been placed in the Nashville region at a COARSE grained (low resolution) map by people who did not drill down deeply at all.

Figure: snippet from ESRI-generated map on PennState course site.
My placement was precisely where the faculty apartment is at McGill Hall (top right yellow pin), but another person's pin is in the middle of Vanderbilt stadium, Dudley Field -- mistake?


These latter pin placements were accurate at the grain level in which they were placed, but (unintentionally) wrong at finer grained renderings of the map.

Shouldn't there be a precision principle of interactive maps like that provided in the class (analogous to rules of precision for floating point numbers I suppose)? Shouldn't there be some functionality for interactive maps, perhaps a research topic, so that I can identify data with a region in a low resolution rendering of a map without being wrong at higher resolution when a user drills down? I can think of strategies to do this, but given the ambiguities of regions that this course has already made us aware of, there must surely be some fleshing out of these ideas.
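One possible strategy, sketched here only as an illustration: record the zoom level at which a pin was placed, and treat the pin as a region rather than a point whenever the map is viewed at a finer zoom. The names below (Pin, placed_zoom, render_style, click_tolerance_px) are hypothetical and not part of any mapping tool used in the course. The ground-resolution constant assumes the common Web Mercator scheme with 256-pixel tiles, which may not match the course map.

```python
import math
from dataclasses import dataclass

# Assumption: Web Mercator with 256-pixel tiles, where one pixel at
# zoom 0 on the equator covers roughly this many meters.
EQUATOR_M_PER_PIXEL_ZOOM0 = 156543.03392


@dataclass(frozen=True)
class Pin:
    lat: float        # latitude in degrees
    lon: float        # longitude in degrees
    placed_zoom: int  # zoom level the user was viewing when placing the pin


def meters_per_pixel(lat: float, zoom: int) -> float:
    """Approximate ground distance covered by one pixel at this latitude and zoom."""
    return EQUATOR_M_PER_PIXEL_ZOOM0 * math.cos(math.radians(lat)) / (2 ** zoom)


def uncertainty_radius_m(pin: Pin, click_tolerance_px: int = 10) -> float:
    """Radius (meters) the pin should be read as covering.

    click_tolerance_px is a hypothetical guess at how many pixels of
    slop a click has; a coarse placement yields a large radius.
    """
    return click_tolerance_px * meters_per_pixel(pin.lat, pin.placed_zoom)


def render_style(pin: Pin, view_zoom: int) -> str:
    """Draw a point only at or above the placement scale; otherwise a region."""
    return "point" if view_zoom <= pin.placed_zoom else "region"


# A pin dropped on the Nashville area from a coarse, region-level view.
coarse_pin = Pin(lat=36.14, lon=-86.80, placed_zoom=5)
print(render_style(coarse_pin, view_zoom=5))    # same scale as placement
print(render_style(coarse_pin, view_zoom=17))   # building-level view
print(round(uncertainty_radius_m(coarse_pin)))  # radius in meters
```

Under this sketch, a pin placed at a coarse zoom never lands on one particular building when someone drills down; it shows up as a circle whose size reflects how coarse the original view was.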

I had the same question a couple of years ago when I started placing pictures on Google maps, most very precise down to a few square feet, but I also wanted to place some pictures on a larger region (e.g., a museum, a park, a city) with NO implication that these pictures were intended to be accurate at a finer grained level. (I wrote about this very fun exercise as well -- see http://aicourses.blogspot.com/2013/07/playing-with-pictures.html; beyond being fun though, it's also a story that is illustrative of a point that is central in the Coursera-hosted PennState Map course -- that maps and geospatial tools/concepts can be central in telling stories!!!)

To the geographers out there -- is anyone doing research on implementing methods that enforce a precision principle of interactive maps?

BTW -- I think that a significance of this precision principle relates to issues of privacy. I'm guessing that those people who do NOT want to precisely identify their location, or identify their demographics (e.g., age, gender) with their own location, are probably sensitive to misidentifying someone else's residence with them (or with their demographics) -- at least I think that many would be sensitive to that.

But if placing a pin (or picture or ...) on a low resolution rendering of a map "accidentally" places the pin on someone else's residence at a higher resolution, then this misrepresenting of the makeup of that residence is exactly what could happen!

MapMooc, a Location Anonymity Index, and a World Connectedness App

I have "audited" several MOOCs, hanging in there a bit before dropping out because I have just too much going on at work. I am now taking a geography MOOC (https://www.coursera.org/course/maps), and doing it with a group of Vanderbilt colleagues, which I think will help me complete this course. That I loved geography as a kid will help, and that I have a class project in mind will help too -- of creating a map of digital learning resources across the Vanderbilt University campus; I think I will end up using crowdsourcing to create this thematic map, with graduate student, staff, and faculty input.

My wife and I listened to a couple of intro lectures while packing up in Bellevue, WA before returning to Nashville. I spent most of this morning reading course material and reflecting on some of the course questions for Week 1, and contributing to the discussion board. Here are some of the questions.

1) What is scary about potentially losing control over your geospatial privacy?

2) What are some positive things that could come from openly sharing your personal location with others?

3) What about geospatial privacy has really changed over time? 200 years ago, would it have been possible to live in your current location without your friends and family knowing where you were most of the time?


Given that the number of discussion board posts is huge, it seems like a reasonable strategy is to join a "temporally local" cohort of discussants, so I contributed to another thread begun just this morning, then went to a couple of "suggested" threads based on keyword matches, and started my own thread on defining a "Location Anonymity Index" (LAI) in answer to question (3) above. As you might guess, as a computer scientist, I am very inclined towards developing metrics like the LAI for all kinds of things. Here it is with modest revision of my MOOC discussion board post.

3) What about geospatial privacy has really changed over time? 200 years ago, would it have been possible to live in your current location without your friends and family knowing where you were most of the time?

Much of my time is spent at work and home, and I think that the number of people who know precisely where I am for a large amount of time, is NOT that different from 200 years ago. In theory, there might be some people who I don't know (e.g., in government agencies), but who know or can access my location from social networks and cell phones, so the theoretical possibilities about location awareness are certainly different now than from 200 years ago. Also, a trip across town or across country would have left my precise location uncertain for long periods of time 200 years ago, and these intervals of location uncertainty are very much shorter today.

All considered, I wonder if it's reasonable to define a "location anonymity index" (LAI) that represents the expected error in OTHER PEOPLE'S GUESSES of MY location (at particular times or across all times). An impractical way of computing my location anonymity index is to ask each person (on Earth!) where I am located and then measure the average difference between where I really am (known by an oracle) and where each person guessed I was. There are all kinds of complications to be worked out, like how "I" am identified (e.g., by name, by picture, by driver's license #, etc), and how "my location" is identified (e.g., at what grain size -- region, city, GPS,…? how is the difference between each guess and my actual location measured?).
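To make the idea concrete, here is a minimal sketch of the "ask everyone and average the error" version of the LAI. It settles the open questions above by assumption: locations are (latitude, longitude) pairs in degrees, and the difference between a guess and the true location is the great-circle (haversine) distance in kilometers. The function names and the sample coordinates are illustrative only.

```python
import math
from statistics import mean

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to guard against tiny floating point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def location_anonymity_index(true_location, guesses):
    """Average error (km) of other people's guesses of one person's location.

    true_location is what the "oracle" knows; guesses holds one
    (lat, lon) pair per person asked.
    """
    if not guesses:
        raise ValueError("need at least one guess")
    return mean(haversine_km(true_location, g) for g in guesses)


# Illustrative, approximate coordinates: the person is in Nashville.
nashville = (36.16, -86.78)
guesses = [
    (36.16, -86.78),   # an "interested" guesser who gets it right
    (47.61, -122.20),  # someone who guesses Bellevue, WA
    (59.33, 18.07),    # an uninformed guess of Stockholm
]
print(f"LAI = {location_anonymity_index(nashville, guesses):.0f} km")
```

A larger value means more anonymity. The same function could be applied to a restricted set of guessers (only the "interested" ones, or only technology-informed ones) to compare variants of the index.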

By this theoretically possible, but practically impossible measure, my location anonymity index (LAI) has probably not changed much in the past 200 years, because the vast majority of other people would be making wildly uninformed guesses, BUT ALSO because the much fewer "interested-in-Doug" persons (my wife, family, co-workers, friends) would make very accurate guesses on my location the vast majority of the time (Doug is home at …, Nashville, TN; Doug is at work at Vanderbilt), so even if we reduce the number of people who we ask to guess my location to those who are "interested", my location anonymity index probably doesn't change much over 200 years if measured in this way.

I think it's interesting and important to consider the case where we change the computation of a location anonymity index (LAI) from simply asking people "cold" (uninformed), to the case where we allow each person to use technology available to them (e.g., Google, Bing, libraries, newspapers, supermarket rewards programs) to find important facts about me (e.g., home, work, blogs, …) BEFORE their guess on my location. This variation gets much closer to measuring the "worst case" situation where someone might be looking for me that wants to cause me harm.

In this latter variant, there is a huge difference in my LAI between now and 200 years ago, EVEN when the computation takes into account uncertainties that have to do with my "identity" (e.g., there is more than one Doug Fisher, in the world, in the US, in academia, in Nashville, etc … and this illustrates another point, which is that a LAI should take into account uncertainties in the extent to which my identity can be isolated).

Returning to questions of "geospatial privacy", which variants of LAI would be intended to measure under different assumptions, I'm not too scared about this -- I share lots of information on the Web that is intended to lessen my geospatial privacy, in particular to lessen the LAI as I conceptualize it. Nonetheless, I have been talking so far about the "average" LAI as a measure of "geospatial privacy", whereas I think most people worry about the "worst-case" -- what if someone with ill intent could locate me or happened upon my location? Again, my latter variant of LAI with technology-informed guessers is intended to capture this worst case. In this regard, I may become much more worried about geolocation privacy as mobile technology continues to develop. For example, I recently took a photo of the Seelbach Hotel lobby in Louisville, KY, and when I posted on Facebook I was prompted to tag (label) the face of an anonymous passerby who appeared in the photo!!!

Technology already exists that would allow the computer itself to do the tagging in that photo and other photos, and that possibility is unnerving, but I see some possible upsides for giving up geospatial privacy. I am thinking about some of the positive messages conveyed by travelers like Rick Steves on world connectedness. But I also think that even some of the very unnerving possibilities like automated computer labeling of faces have some neat possibilities. I imagine jogging in Centennial Park in Nashville TN 10 years from now, and having my wearable smart glasses tell me that it "thinks" I've seen that same passerby in Centennial Park, at the Vasa Museum in Stockholm, Sweden 5 years previously. I know this sounds scary too, but I find something in that level of connectedness to other people quite intriguing.

That said, I am a lot more worried about privacy along other dimensions, such as health care (probably for reasons that are specific to the US relative to much of the rest of the World) and finances/banking privacy. 

Monday, July 15, 2013

Reusing Other Instructors' Assignments ... or not?

I am at the Educational Advances in Artificial Intelligence symposium (EAAI-13), and we just concluded a session on educational repositories, particularly online repositories of homework assignments. Repositories of educational resources are a topic near and dear to my heart, but at least in the case of repositories of homework assignments, there appears to be no, little, or at best weak anecdotal evidence that assignments are being reused. At a minimum, don't we want repositories to be "instrumented," like my (and everyone's) YouTube channel(s), so I can see downloads, likes, dislikes, and more sophisticated measures of usage that are specific to homework assignments?

It's hard to know if a homework assignment that has been posted in an educational repository is actually used by another instructor, unless an instructor who has used it gets back to me and tells me so. There is some work in thinking about how to do this. But there is also low hanging fruit. First, we can measure downloads, but beyond this, we as an educational community can take a small step towards a scholarly culture surrounding education materials by designing licenses specific to this kind of content.

For example, a license for usage of educational content could require that the material can be used by others (e.g., following any of the principles of creative commons licenses: http://creativecommons.org/), but additionally require that the user report back on the usage to the author (typically, the copyright holder), whether the use is as is, or derivative.

I think that this would be an incredible help to evaluating the extent and manner of use of educational material, going well beyond measuring downloads, and ultimately of evaluating the utility of educational materials to the educational community.

Let's ask people about their use, through a license that requires report back (and nothing else), rather than simply depending on the ability of machine methods to infer it.

Sunday, July 7, 2013

A MOOC is NOT a Textbook

When I first started using video lectures from Jennifer Widom's Database MOOC and Andrew Ng's Machine Learning MOOC in Spring 2012 for my courses at Vanderbilt, I was worried about what people might think.  I did at least two things in response. One, I started creating my own online content that others could use; giving back made me feel much better about using the content of others. Two, I cast the MOOCs as "multimedia textbooks", just the next step in a natural evolution (e.g., https://my.vanderbilt.edu/cs390fall2012/). At the time I really did think of a MOOC as something of a "textbook." But the fact is that a MOOC is NOT a textbook, multimedia or otherwise, perhaps an obvious truth that I only reconsidered recently as I prepared for a panel presentation (https://my.vanderbilt.edu/douglasfisher/files/2013/06/DougFisherMOOCsOnCampusPresentation.pdf) at AAUP 2013 (http://www.aaupnet.org/events-a-conferences/annual-meeting/aaup-2013/program).

Others have also advocated MOOCs as multimedia textbooks for reasons of promoting acceptance by skeptics of online education, and while MOOCs may contain material that will become part of such textbooks, MOOCs aren't textbooks per se. I think that MOOC-offering organizations acknowledge this, perhaps implicitly, in that they are entering into agreements with other resource providers, like textbook publishers, so as to augment the MOOC experience (e.g., http://connection.sagepub.com/blog/2013/05/08/sage-coursera-partnership/).

Beyond stating the obvious, I am thinking aloud here about how community-developed, online, multimedia textbooks might arise.

The authors of even a mediocre textbook aspire to be somewhat comprehensive in their coverage of a field, almost always including more material than any one instructor would cover in any one course, and as importantly, a good textbook will synthesize across that material. A good textbook supports large-scale customization, as different instructors at different universities create their own course variations, each anchored by the same textbook. By and large, MOOCs aren't (yet!) designed with customization in mind. Like other courses, a MOOC is a single trajectory through a single selected subset of a field's content. Some MOOCs (and online courses before the 'MO' movement) have multiple tracks, and this is a BIG innovation relative to what's typically done with an on-campus course. Each track is a trajectory, and the differing, albeit overlapping tracks of a MOOC represent a step towards supporting choice and customization, but only a step.

While supporting limited choice, a MOOC still covers only a small part of the material that a typical textbook might cover. A textbook for artificial intelligence (e.g., http://artint.info/html/ArtInt.html), for example, will often cover natural language processing (NLP) by computer, and different computational approaches in this area, but it's rarely material that I cover. This is another example of a textbook providing the freedom to follow one of many different tracks, some that include NLP, and some not.

While courses, online or otherwise, support limited choice of trajectories through material, online repositories of educational content, such as TeachingTree (http://www.teachingtree.co/) or even YouTube (http://www.youtube.com/t/education) on a much larger scale, support "unlimited" choice and customization -- these latter resources are UNDER-constrained in terms of choice, just as MOOCs are OVER-constrained, at least if we want to start thinking of such resources as "textbooks." A good textbook, one might argue, is "optimally" constrained, providing enough choice for instructors to follow different trajectories, but also providing enough structure, constraints, and guidance so that the choice is not overwhelming!

Repositories of online educational material are growing, providing ever increasing choice for educators and learners, but as yet, these repositories provide insufficient constraints and guidance on how choice can be effectively navigated for course customization. There is interest and work in using crowd-sourcing, through mechanisms such as Wikimedia, to build structure on top of these resources (e.g., http://www.aaai.org/ocs/index.php/SSS/SSS12/paper/download/4343/4693/), but I think that it will take dedicated authors, working individually or in small groups, to step up to the plate and synthesize across these online resources if we are to see good-quality multimedia textbooks -- resources that effectively trade off choice and structure -- neither too under-constrained nor too over-constrained.

So, perhaps I overreacted in saying that a MOOC is not a textbook. Rather, along this continuum between over and under constraint for purposes of supporting customization and diversity of content and teaching style, a MOOC, or any other course for that matter, lies on the over-constrained end, something of an impoverished textbook at best -- though it may in fact be an excellent course!