
The Society, Environment and Economics Lab

2015 November 4

seel

I’d like to introduce SEEL, David Anthoff’s nascent lab within the Energy and Resources Group at UC Berkeley. What was initially a ramshackle group of Ph.D. students, associated with David for as little reason as that economically minded folk in ERG’s engineering-focused community need to stick together, seems to be growing into a healthy researching machine. Check out the new website for the Society, Environment and Economics Lab.

The current drivers are around FUND, a widely-used integrated assessment model, maintained by David. For a long time, models like this have been black boxes, and FUND is one of the few with open source code. That’s changing with David’s new modeling framework, Mimi, which has allowed him to rewrite FUND as a collection of interconnected components.

I like the vision, and I think it’s implemented in a way that has real legs for shifting the climate impact assessment process into a more open process. But we’ll find out soon. The National Academy of Sciences is meeting soon to discuss the future of the “social cost of carbon”, an influential quantity computed by models like FUND. David is going to try to convince them that the future of impact modeling looks like Mimi. Godspeed.

Making your own duct tape wallet

2015 October 27

Duct tape wallets are cool, thin and light, and personalizable. The instructions below describe my design, which I think is elegant, and which you can modify to your heart’s content.

Step 1.

Measure out the length of the two longest strips of duct tape:

Line four bills up, just touching along their long edges. Rip two
small strips of duct tape to measure an additional width to the left
and right of the four bills, or use credit cards, as shown below.

How to measure the backbone

Measuring the backbone

Step 2.

Measure out one strip of duct tape this length and lay it sticky-side-up.
Then measure a second strip and lay it sticky-side-up with just
enough overlap to form a secure connection.

The backbone diagram The backbone

Step 3.

Fold the strips into the basic wallet frame, by first folding them
in half, with the sticky-side out. Then continue folding in an
accordion fashion, only allowing the faces with the same letter
shown below to stick together. Make sure that these adhering faces
are smooth and even.

Folding faces Folding result

First fold

The first fold, in half, with a bill to measure the second fold.

Second fold

After the second fold.

Flip over

After the third fold and flipping over.

Final backbone

After the rest of the backbone folds.

Step 4.

Measure out a length of duct tape a little larger than twice the
width of the wallet and wrap it around the outside, with the
sticky side covering the remaining sticky side of the wallet frame.

The wrapper diagram The wrapper

That’s it!  Enjoy your new wallet!

Final wallet

One console to rule them all

2015 September 26

I love text consoles. The more I can do without moving a mouse or opening a new window, the better. So, when I saw XKCD’s command-line interface, I grabbed the code and started to build new features into it, as my kind of browser window to a cyber world of text.

I want to tell you about my console-based time-management system, the entertainment system, the LambdaMOO world, the integration with my fledgling single-stream analysis toolbox. But the first step was to clean out the password-protected stuff, and expose the console code for anyone who wants it.

So here it is! Feel free to play around on the public version, http://console.existencia.org/, or clone the repository for your own.

screenshot

Here are the major changes from the original XKCD code by Chromacode:

  • Multiple “shells”: I currently just have the Javascript and XKCD-Shell ones exposed. Javascript gives you a developer-style javascript console (but buggy). You can switch between the two by typing x: and j:.
  • A bookmark system: ln URL NAME makes a new bookmark; ls lists the available bookmarks, and cd NAME opens a bookmark.
  • A login/registration system: Different users can have different bookmarks (and other stuff). Leave ‘login:’ blank the first time to create a new account.
  • Some new commands, but the only one I’m sure I left in is scholar [search terms] for a Google Scholar search.

Share, expand, and enjoy!

Labor Day 2015: More hours for everyone

2015 September 9

In the spirit of Labor Day, I did a little research into Labor issues. I wanted to explore how much time people spent either at or in transit to work. Ever since the recession, it seems like we are asked to work longer and harder than ever before. I’m thinking particularly of my software colleagues who put in 60 hour weeks as a matter of course, and I wanted to know if it’s true across sectors. Has the relentless drive for efficiency in the US economy taken us back to the limit of work-life balance?

I headed to the IPUMS USA database and collected everything I could find on the real cost of work.

When you look at average family working hours (that is, averaged across spouses for couples), there’s been a huge shift, from an average of 20-25 hours/week to 35-40. If those numbers seem low, note that this is averaged over the entire year, including vacation days, and includes many people who are underemployed.

The graph below shows the shift, and that it’s not driven specifically by either employees or the self-employed. The grey bands show one standard deviation, with a huge range that is even larger for the self-employed.

klass

So who has been caught up in this shift? Everyone, but some industries and occupations have seen their relative work-life balance shift quite a bit. The graph below shows a point for every occupation-and-industry combination that represents more than 0.1% of my sample.

hours

In 1960, you were best off as a manager in mining or construction; and worst as a laborer in the financial sector. While that laborer position has gotten much worse, it has been superseded in hours by at least two jobs: working in the military, and the manager position in mining that once looked so good. My friends in software are under the star symbols, putting in a few more hours than the average. Some of the laboring classes are doing relatively well, but still have 5 more hours of work a week than they did 40 years ago.

We are, all of us, more laborers now than we were 60 years ago. We struggle in our few remaining hours to maintain our lives, our relationships, and our humanity. The Capital class is living large, because the rest of us have little left to live.

The role of non-empirical science

2015 September 2

The New York Times has an op-ed today that argues “Psychology Is Not in Crisis,” in response to the response to a paper that tried and failed to reproduce 60 of 100 psychology experiments. I have been thinking for a long time about the importance of falsifiability in science, and the role of the many kinds of research we do in light of it.

I was recently re-perusing Collins et al. 2010, which purports to address the need for an integrated approach to environmental science, with a new conceptual framework. The heart of the framework is the distinction between “pulse” and “press” dynamics. I do not want to explain the difference here though. I want to know if we learn something from it.

Knowledge comes in many forms. There’s empirical knowledge, facts about the world that we could not have known until they were observed; analytical knowledge, resulting from the manipulation of logical constructs; and wisdom, inarticulable knowledge that comes from experience.

The Collins et al. paper uses analysis, but it proves no theorems. Of course, analysis can be a powerful tool without mathematical analytics. Recognizing multiple parts of a whole can open doors in the mind, and provide substance to a question. Nonetheless, the criterion in science for the usefulness of analysis is: does it allow us to learn something we did not already know? Knowing that fire is a pulse dynamic while climate change is a press dynamic could come in handy, if these categories added additional knowledge.

I claim that papers like this do not try to teach analytical knowledge, although they focus on a piece of analysis. Their goal is to expand our wisdom, by giving it shape. The distinction is not tied to anything we did not already know about fire and climate change. Like a professor who notices two things being conflated, the paper tries to expand our vocabulary and through it our world. Alas, it is exactly the wherewithal to shape our conceptual world that constitutes the wisdom sought. Pulse and press dynamics are one nice distinction, but there are so many others that might be relevant. Having a distinction in mind of pulse and press dynamics is only useful if I can transcend it.

Knowledge builds upon itself, and naturally bleeds between empirics, analysis, and wisdom. I am not a psychologist, but I presume that they are seeking knowledge in all of its forms. The discovery that 60 empirical building blocks were not as sure as they appeared does not undermine the process of science in psychology, and indeed furthers it along, but I hope that it undermines psychology-the-field, and the structure of knowledge that it has built.

Crop categories

2015 August 25

One thing that makes agriculture research difficult is the cornucopia of agricultural products. Globally, there are around 7,000 harvested species and innumerable subspecies, and even if 12 crops have come to dominate our food, it doesn’t stop 252 crops from being considered internationally important enough for the FAO to collect data on.

Source: Dimensions of Need: An atlas of food and agriculture, FAO, 1995


It takes 33 crop entries in the FAO database to account for 90% of global production, and at least 5 of those entries include multiple species.

Global production (MT), Source: FAO Statistics
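For anyone who wants to repeat the "how many entries cover 90%" count on their own table, here is a minimal Python sketch. The function name and the toy numbers are made up for illustration (they are not FAO data); the only assumption is that you have a mapping from crop entry to production.

```python
def entries_needed(production, share=0.9):
    """Count how many of the largest entries are needed to reach `share` of the total."""
    total = sum(production.values())
    running = 0.0
    # Walk through the entries from largest to smallest, accumulating production
    ranked = sorted(production.values(), reverse=True)
    for count, amount in enumerate(ranked, start=1):
        running += amount
        if running >= share * total:
            return count
    return len(production)

# Made-up toy numbers, purely to show the calculation (NOT FAO data)
toy_production = {"crop A": 50, "crop B": 30, "crop C": 12, "crop D": 5, "crop E": 3}
print(entries_needed(toy_production))
```

With real data, `production` would hold one value per FAO crop entry.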


Worse, different datasets collect information on different crops. Outside of the big three, there’s a Wild West of agriculture data to dissect. What’s a scientist to do?

The first step is to reduce the number of categories, to more than 2 (grains, other) and fewer than 252. By comparing the categories used by the FAO and the USDA, and also considering categories for major datasets I use, like the MIRCA2000 harvest areas and the Sacks crop calendar (and using a share of tag-sifting code to be a little objective), I came up with 10 categories:

  • Cereals (wheat and rice)
  • Coarse grains (not wheat and rice)
  • Oilcrops
  • Vegetables (including miscellaneous annuals)
  • Fruits (including miscellaneous perennials– plants that “bear fruit”)
  • Actives (spices, psychoactive plants)
  • Pulses
  • Tree nuts
  • Materials (and decoratives)
  • Feed

You can download the crop-by-crop (and other dataset category) mapping, currently as a PDF: A Crop Taxonomy

Still, most of these categories admit further division: fruits into melons, citrus, and non-citrus; splitting out the subcategory of caffeinated drinks from the actives category. What we need is a treemap for a cropmap! The best-looking maps I could make were using the R treemap package, shown below with rectangles sized by their global harvest area.

treemap

You can click through a more interactive version, using Google’s treemap library.

What does the world look like, with these categories? Here, it is colored by which category the majority production crop falls into:

majorities

And since that looks rather cereal-dominated to my taste, here it is just considering fruits and vegetables:

fruitveggie

For now, I will leave the interpretation of these fascinating maps to my readers.

Economic Risks of Climate Change Book out tomorrow!

2015 August 11

The research behind the Risky Business report will be released as a fully remastered book, tomorrow, August 11!  This was a huge collaborative effort, led by Trevor Houser, Solomon Hsiang, and Robert Kopp, and coauthored with nine others, including me:

Economic Risks of Climate Change

From the publisher’s website:

Climate change threatens the economy of the United States in myriad ways, including increased flooding and storm damage, altered crop yields, lost labor productivity, higher crime, reshaped public-health patterns, and strained energy systems, among many other effects. Combining the latest climate models, state-of-the-art econometric research on human responses to climate, and cutting-edge private-sector risk-assessment tools, Economic Risks of Climate Change: An American Prospectus crafts a game-changing profile of the economic risks of climate change in the United States.

The book combines an exciting new approach to solidly ground results in data with an extensive overview of the world of climate change impacts. Take a look!

Guest Post: The trouble with anticipation (Nate Neligh)

2015 July 2

Hello everyone, I am here to do a little guest blogging today. Instead of some useful empirical tools or interesting analysis, I want to take you on a short tour through one of the murkier aspects of economic theory: anticipation. The very idea of the ubiquitous Nash Equilibrium is rooted in anticipation. Much of behavioral economics is focused on determining how people anticipate one another’s actions. While economists have a pretty decent handle on how people will anticipate and act in repeated games (the same game played over and over) and small games with a few different decisions, not as much work has been put into studying long games with complex history dependence. To use an analogy, economists have done a lot of work on games that look like poker but much less work on games that look like chess.

One of the fundamental problems is finding a long form game that has enough mathematical coherence and deep structure to allow the game to be solved analytically. Economists like analytical solutions when they are available, but it is rare to find an interesting game that can be solved by pen and paper.

Brute force simulation can be helpful. Simply simulating all possible outcomes and using a technique called backwards induction, we can solve the game in a Nash Equilibrium sense, but this approach has drawbacks. First, the technique is limited. Even with a wonderful computer and a lot of time, there are some games that simply cannot be solved in human time due to their complexity. More importantly, any solutions that are derived are not realistic. The average person does not have the ability to perform the same computations as a super computer. On the other hand, people are not as simple as the mechanical actions of a physics inspired model.
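To make backwards induction concrete, here is a minimal Python sketch on a tiny two-player game. The game itself (an entrant deciding whether to enter, an incumbent deciding whether to fight) and all the names and payoffs are hypothetical, chosen only for illustration; this is not the network formation game discussed in this post.

```python
# A node is either a leaf {"payoffs": (p0, p1)} or a decision node
# {"player": index_of_mover, "moves": {action_name: child_node}}.

def solve(node):
    """Return (payoffs, actions) for the backwards-induction outcome below node."""
    if "payoffs" in node:
        return node["payoffs"], []
    player = node["player"]
    best = None
    for action, child in node["moves"].items():
        # Solve the subgame first, then let the mover compare the results
        payoffs, actions = solve(child)
        if best is None or payoffs[player] > best[0][player]:
            best = (payoffs, [action] + actions)
    return best

# Hypothetical entry game: player 0 moves first, player 1 responds
GAME = {
    "player": 0,
    "moves": {
        "stay out": {"payoffs": (0, 2)},
        "enter": {
            "player": 1,
            "moves": {
                "fight": {"payoffs": (-1, -1)},
                "accommodate": {"payoffs": (1, 1)},
            },
        },
    },
}

print(solve(GAME))
```

Every subgame is visited exactly once, so the work grows with the size of the whole tree, which is why the approach breaks down so quickly for long games.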

James and I have been working on a game of strategic network formation which effectively illustrates all these problems. The model takes 2 parameters (the number of nodes and the cost of making new connections) and uses them to strategically construct a network in a decentralized way. The rules are extremely simple and almost completely linear, but the complexities of backwards induction make it impossible to solve by hand for a network of any significant size (some modifications can be added which shrink the state space to the point where the game can be solved). Backwards induction doesn’t work for large networks, since the number of possible outcomes grows at a rate of (roughly) but what we can see is intriguing. The results seem to follow a pattern, but they are not predictable.

The trouble with anticipation

 

Each region of a different color represents a different network (colors selected based on network properties). The y-axis is the discrete number of nodes in the network. The x-axis is a continuous cost parameter. Compare where the color changes as the cost parameter is varied across the different numbers of nodes. As you can see, switch points tend to be somewhat similar across network scales, but they are not completely consistent.

Currently we are exploring a number of options; I personally think that agent-based modeling is going to be the key to tackling this type of problem (and those that are even less tractable) in the future. Agent based models and genetic algorithms have the potential to be more realistic and more tractable than any more traditional solution.

Google Scholar Alerts to RSS: A punctuated equilibrium

2015 May 14

If you’re like me, you have a pile of Google Scholar Alerts that you never manage to read. It’s a reflection of a more general problem: how do you find good articles, when there are so many articles to sift through?

I’ve recently started using Sux0r, a Bayesian filtering RSS feed reader. However, Google Scholar sends alerts to one’s email, and we’ll want to extract each paper as a separate RSS item.

alertemail

Here’s my process, and the steps for doing it yourself:

Google Scholar Alerts → IFTTT → Blogger → Perl → DreamHost → RSS → Bayesian Reader

  1. Create a Blogger blog that you will just use for Google Scholar Alerts: Go to the Blogger Home Page and follow the steps under “New Blog”.
  2. Sign up for IFTTT (if you don’t already have an account), and create a new recipe to post emails from scholaralerts-noreply@google.com to your new blog. The channel for the trigger is your email system (Gmail for me); the trigger is “New email in inbox from…”; the channel for the action is Blogger; and the title and labels can be whatever you want as long as the body is “{{BodyPlain}}” (which includes HTML).

    ifttttrigger

  3. Modify the Perl code below, pointing it to the front page of your new Blogger blog. It will return an RSS feed when called at the command line (perl scholar.pl).

    rssfeed

  4. Upload the Perl script to your favorite server (mine, http://existencia.org/, is powered by DreamHost).
  5. Point your favorite RSS reader to the URL of the Perl script as an RSS feed, and wait as the Google Alerts come streaming in!

Here is the code for the Alert-Blogger-to-RSS Perl script. All you need to do is fill in the $url line below.

#!/usr/bin/perl -w
use strict;
use CGI qw(:standard);

use XML::RSS; # Library for RSS generation
use LWP::Simple; # Library for web access

# Download the first page from the blog
my $url = "http://mygooglealerts.blogspot.com/"; ### <-- FILL IN HERE!
my $input = get($url);
my @lines = split /\n/, $input;

# Set up the RSS feed we will fill
my $rss = XML::RSS->new(version => '2.0');
$rss->channel(title => "Google Scholar Alerts");

# Iterate through the lines of HTML
my $ii = 0;
while ($ii < $#lines) {
    my $line = $lines[$ii];
    # Look for a <h3> starting the entry
    if ($line !~ /^<h3 style="font-weight:normal/) {
        $ii++;
        next;
    }

    # Extract the title and link
    $line =~ /<a href="([^"]+)"><font .*?>(.+)<\/font>/;
    my $title = $2;
    my $link = $1;

    # Extract the authors and publication information
    my $line2 = $lines[$ii+1];
    $line2 =~ /<div><font .+?>([^<]+?) - (.*?, )?(\d{4})/;
    my $authors = $1;
    my $journal = (defined $2) ? $2 : '';
    my $year = $3;

    # Extract the snippets
    my $line3 = $lines[$ii+2];
    $line3 =~ /<div><font .+?>(.+?)<br \/>/;
    my $content = $1;
    for ($ii = $ii + 3; $ii < @lines; $ii++) {
        my $linen = $lines[$ii];
        # Are we done, or is there another line of snippets?
        if ($linen =~ /^(.+?)<\/font><\/div>/) {
            $content = $content . '<br />' . $1;
            last;
        } else {
            $linen =~ /^(.+?)<br \/>/;
            $content = $content . '<br />' . $1;
        }
    }
    $ii++;

    # Use the title and publication for the RSS entry title
    my $longtitle = "$title ($authors, $journal $year)";

    # Add it to the RSS feed
    $rss->add_item(title => $longtitle,
                   link => $link,
                   description => $content);
        
    $ii++;
}

# Write out the RSS feed
print header('application/rss+xml');
print $rss->as_string;
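
If you want to check what the authors-and-publication pattern captures before deploying the script, here is a small Python sketch that reuses the same regular expression. The function name and the sample HTML lines are my own illustration (the real markup in Google Scholar's emails may differ), and unlike the Perl above it returns nothing when a line doesn't match instead of reusing stale captures.

```python
import re

# Same pattern as the Perl script: authors, optional "journal, " and a 4-digit year
PUBLICATION = re.compile(r'<div><font .+?>([^<]+?) - (.*?, )?(\d{4})')

def parse_publication(line):
    """Return authors, journal and year from one line of alert HTML, or None."""
    match = PUBLICATION.search(line)
    if match is None:
        return None
    authors, journal, year = match.groups()
    # The journal group is optional and carries a trailing ", " when present
    return {"authors": authors,
            "journal": (journal or "").rstrip(", "),
            "year": year}

# Hypothetical sample line, only for illustration
sample = ('<div><font size="-1" color="#006621">'
          'A Author, B Writer - Journal of Examples, 2015</font></div>')
print(parse_publication(sample))
```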

In Sux0r, here are a couple of items from the final result:

sux0rfeed

Scripts for Twitter Data

2015 April 22

Twitter data– the endless stream of tweets, the user network, and the rise and fall of hashtags– offers a flood of insight into the minute-by-minute state of the society. Or at least one self-selecting part of it. A lot of people want to use it for research, and it turns out to be pretty easy to do so.

You can either purchase twitter data, or collect it in real-time. If you purchase twitter data, it’s all organized for you and available historically, but it basically isn’t anything that you can’t get yourself by monitoring twitter in real-time. I’ve used GNIP, where the going rate was about $500 per million tweets in 2013.

There are two main ways to collect data directly from twitter: “queries” and the “stream”. Queries let you get up to 1000 tweets at any point in time– whichever are the most recent tweets that match your search criteria. The stream continuously gives you a fraction of a percent of all tweets, selected by your filtering criteria, which very quickly adds up.

Scripts for doing these two options are below, but you need to decide on the search/streaming criteria. Typically, these are search terms and geographical constraints. See Twitter’s API documentation to decide on your search options.

Twitter uses an authentication system to identify both the individual collecting the data, and what tool is helping them do it. It is easy to register a new tool, whereby you pretend that you’re a startup with a great new app. Here are the steps:

  1. Install python’s twitter package, using “easy_install twitter” or “pip install twitter”.
  2. Create an app at https://apps.twitter.com/. Leave the callback URL blank, but fill in the rest.
  3. Set the CONSUMER_KEY and CONSUMER_SECRET in the code below to the values you get on the keys and access tokens tab of your app.
  4. Fill in the name of the application.
  5. Fill in any search terms or structured searches you like.
  6. If you’re using the downloaded scripts, which output data to a CSV file, change where the file is written, to some directory (where it says “twitter/us_”).
  7. Run the script from your computer’s terminal (i.e., python search.py)
  8. The script will pop up a browser for you to log into twitter and accept permissions from your app.
  9. Get data.

Here is what a simple script looks like:

import os, twitter

APP_NAME = "Your app name"
CONSUMER_KEY = 'Your consumer key'
CONSUMER_SECRET = 'Your consumer secret'

# Do we already have a token saved?
MY_TWITTER_CREDS = os.path.expanduser('~/.class_credentials')
if not os.path.exists(MY_TWITTER_CREDS):
    # This will ask you to accept the permissions and save the token
    twitter.oauth_dance(APP_NAME, CONSUMER_KEY, CONSUMER_SECRET,
                        MY_TWITTER_CREDS)

# Read the token
oauth_token, oauth_secret = twitter.read_token_file(MY_TWITTER_CREDS)

# Open up an API object, with the OAuth token
api = twitter.Twitter(api_version="1.1", auth=twitter.OAuth(oauth_token, oauth_secret, CONSUMER_KEY, CONSUMER_SECRET))

# Perform our query
tweets = api.search.tweets(q="risky business")

# Print the results
for tweet in tweets['statuses']:
    if 'text' not in tweet:
        continue

    print(tweet)
    break

For automating twitter collection, I’ve put together scripts for queries (search.py), streaming (filter.py), and bash scripts that run them repeatedly (repsearch.sh and repfilter.sh). Download the scripts.

To use the repetition scripts, make them executable by running “chmod a+x repsearch.sh repfilter.sh”. Then run them by typing ./repfilter.sh or ./repsearch.sh. Note that these will create many, many files over time, which you’ll have to merge together.
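
For the merging step, here is a minimal Python 3 sketch. It assumes (and this is only an assumption about the output files) that every CSV file starts with the same header row; the function name and the file pattern in the usage comment are hypothetical, so adjust them to wherever your files actually end up.

```python
import csv

def merge_csv_files(paths, out_path):
    """Concatenate CSV files that share a header row, writing the header once."""
    header = None
    rows_written = 0
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path in sorted(paths):
            with open(path, newline="") as source:
                reader = csv.reader(source)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    # First non-empty file: remember and write its header
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError("Header mismatch in " + path)
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written

# Example usage (the file pattern is a guess, not taken from the scripts):
# import glob
# merge_csv_files(glob.glob("twitter/us_*.csv"), "merged.csv")
```

Make sure the output file is not itself matched by the pattern you merge from.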