git is too complicated for me

17 November 2009

git is too complicated for me. It’s supposed to be awesome, but I just can’t figure it out.


Something I enjoy

31 July 2009

Something I enjoy:

Finding a class that implements what I was going to write by hand.

I’m looking at you, ReentrantReadWriteLock.


Programming is like

14 May 2009

Programming is like Buddhism. No one can tell you how to do it. You must do it for yourself. You are the only one who can get in touch with your own reality.

Programming is like the Matrix. There is a difference between knowing the path and walking it. I can only show you the door. You have to go through it.

Programming is like mathematics. There is a crucial relationship between syntax and semantics that fills the soul with ecstasy.

Programming is like the brain. The line is blurred between what is static and what is dynamic.

Programming is like the mind. It is everything and it is nothing. It is you and me, and everything in between.

Programming is like music. The answers are out there; it just takes the right kind of person to capture them and tell them to the world.

The answer to life, the universe, and everything is out there; it just takes some time for the global consciousness to understand everything there is to understand. Communication is critical.


Instead of study

30 April 2009

Things I want to do right now instead of study:

  • Lie on the beach.
  • Be hot in the sun.
  • See a lake or a mountain or a sunset.
  • Read about cognitive science.
  • Write a program in Python.
  • Listen to Dark Side of the Moon on repeat at maximum volume while driving.
  • Play piano.
  • See people I haven’t seen in 9-12 months.
  • Go to a concert.
  • Create a beautiful proof in computer science.
  • Get a melodica.

Scanning for email addresses

9 March 2009

People who want to put an email address up on a web page often write it out like this

example at example dot com

or use some variation. The purpose is to effectively “hide” the email address from an eeevil program which searches the web, looking for email addresses so that it can send them spam about Viagra and whatnot. This seems to be a leftover from earlier days of the Internet, because it should be pretty simple for a computer program to figure out an email address from a string of that form. For example, the following simple regular expression will match an email like the one above:

.*at.*dot.*

This is, of course, non-comprehensive and would match false positives in many cases, but it is just an extremely simple example which matches the test string I gave above. One could easily modify the email address regular expression given at http://www.regular-expressions.info/email.html to include addresses with the @ and . symbols written out as English words.
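As a rough sketch of that modification (not the pattern from the page linked above), here is a slightly stricter version in Python that also rewrites the match back into a normal address. The pattern and the name deobfuscate are my own assumptions for illustration: a local part, the word "at", then one or more domain labels joined by the word "dot".

```python
import re

# Hypothetical pattern: a local part, the word "at", then domain labels
# joined by the word "dot". It is a sketch, not a full address grammar.
OBFUSCATED = re.compile(
    r"\b([\w.+-]+)\s+at\s+([\w-]+(?:\s+dot\s+[\w-]+)+)\b",
    re.IGNORECASE,
)


def deobfuscate(text):
    """Return the addresses recovered from "name at host dot tld" spellings."""
    found = []
    for match in OBFUSCATED.finditer(text):
        local, domain = match.group(1), match.group(2)
        # Collapse each spelled-out "dot" (with its spaces) into a period.
        domain = re.sub(r"\s+dot\s+", ".", domain, flags=re.IGNORECASE)
        found.append(local + "@" + domain)
    return found


if __name__ == "__main__":
    # prints ['example@example.com']
    print(deobfuscate("Write to example at example dot com for details."))
```

It still produces false positives (a sentence like "look at example dot com" comes out as an address), but a spammer probably does not mind a few bad addresses in the list.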

I have seen a couple of seemingly effective deterrents from eeevil programs which might use a regular expression such as this to find email addresses.

  1. Use JavaScript to dynamically write the email address in the user’s web browser when the page is loaded (so that the email address is not stored on the web server as a string in the static text of the web page being served). For example, the following JavaScript will write out the email address given above, but the result is not immediately obvious from the code:
    
    var f = "example at ";
    f = f + "example dot com";
    document.write(f);
    

    If you were really committed to scraping email addresses from webpages, you could just give your eeevil program access to a JavaScript interpreter, interpret the JavaScript on the webpage, then check the output for email addresses in the usual way.

  2. Create an image representing the email address, à la Facebook. Using PHP with the GD image creation library, this becomes quite simple. For example, the following PHP script creates an image which includes the example email address given above:
    <?php
    // we are sending out an image header, because once this script has been
    //  interpreted, it will yield a PNG image in memory
    header("Content-type: image/png");
    
    // get the text to write
    $string = "example@example.com";
    
    // determine the width and height for the image which will be created
    $width = strlen($string) * 6;
    $height = 16;
    
    // create the image
    $image = imagecreatetruecolor($width, $height);
    
    // define some colors
    $white = imagecolorallocate($image, 0xFF, 0xFF, 0xFF);
    $black = imagecolorallocate($image, 0x00, 0x00, 0x00);
    
    // make the background white
    imagefilledrectangle($image, 0, 0, $width - 1, $height - 1, $white);
    
    // write the string
    imagestring($image, 2, 0, 0, $string, $black);
    
    // write the image
    imagepng($image);
    
    // free the memory
    imagedestroy($image);
    ?>
    

    The output will look something like this: [image: the email address rendered as black text on a white background]
    This is roughly how your email address is displayed under your “Info” on your Facebook profile.

    Again, if you were really committed to scraping email addresses from webpages for some reason, I suppose you could use a program that involves optical character recognition to convert the images to text and then spam away.

  3. Require understanding of context from surrounding text. This is probably the only way to effectively hide an email address (or whatever) from an eeevil program, because artificial intelligence research has quite a ways to go until a computer program can “understand” natural language in the same way that we can. A simple example of understanding from context might involve me saying, “My favorite email client is Google’s mail client. My email address is my first name and my last name separated by a period, at my favorite mail client.” It’s not too hard for us humans to figure that out, but the robots never even had a chance.

Duff’s device

3 March 2009

Duff’s device is an optimization in C for copying bytes by means of loop unrolling. It is a clever use of C’s notoriously lax syntax. Note that the inventor of this method, Tom Duff (“Duffman thrusting in the direction of the problem!”), was originally copying a series of bytes into a single destination register, so the to pointer in his code below is never incremented:

dsend(to, from, count)
char *to, *from;
int count;
{
  int n = (count + 7) / 8;
  switch (count % 8) {
    case 0: do { *to = *from++;
    case 7:      *to = *from++;
    case 6:      *to = *from++;
    case 5:      *to = *from++;
    case 4:      *to = *from++;
    case 3:      *to = *from++;
    case 2:      *to = *from++;
    case 1:      *to = *from++;
               } while (--n > 0);
  }
}

Cute, right? Here’s an example in which I use it to copy a string (notice that I increment the destination pointer in Duff’s device, resulting in an exact copy of the source string):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>

#define SRC_STRING "Hello, world!"

int main() {
  char *src = NULL, *dest = NULL;
  int size = 0, n = 0;

  // define the source string
  src = SRC_STRING;

  // get the size of the source string plus one for the null string terminator
  size = strlen(src) + 1;

  // allocate memory for the destination string, bailing out if that fails
  dest = malloc(sizeof(char) * size);
  if (dest == NULL) return 1;

  // Duff's device, using loop unrolling for incremental copy of bytes
  n = (size + 7) / 8;
  switch (size % 8) {
    case 0: do { *(dest++) = *(src++);
    case 7:      *(dest++) = *(src++);
    case 6:      *(dest++) = *(src++);
    case 5:      *(dest++) = *(src++);
    case 4:      *(dest++) = *(src++);
    case 3:      *(dest++) = *(src++);
    case 2:      *(dest++) = *(src++);
    case 1:      *(dest++) = *(src++);
               } while (--n>0);
  }

  // the source and destination pointers are currently pointing to the end
  //  of their respective strings; this brings them back to the start
  dest -= size;
  src -= size;

  // assert that the two strings are equal
  assert(strcmp(src, dest) == 0);

  // output the source and destination strings
  printf("%s\n", src);
  printf("%s\n", dest);

  // free the memory (dest points at the start of the buffer again)
  free(dest);

  return 0;
}

Save this code (for example, as duff.c), then compile it and run it with


cc -o duff duff.c && ./duff
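The part that tends to confuse people is how many copies happen on each trip through the do-while loop. Here is a rough model of just that bookkeeping in Python. Python has no fall-through switch, so this mirrors the arithmetic and not the trick itself; the names duff_passes and duff_copy are made up for illustration. Like the C code, it assumes the count is positive (with a count of 0, the C version would still run the loop body once and copy 8 bytes).

```python
def duff_passes(count, unroll=8):
    """Model how many copies each pass of the do-while loop performs.

    Hypothetical helper for illustration; like the C code, it assumes count > 0.
    """
    if count <= 0:
        raise ValueError("the device assumes a positive count")
    passes = (count + unroll - 1) // unroll   # n = (count + 7) / 8 in the C code
    first = count % unroll or unroll          # case 0 enters at the top of the loop
    # The switch jumps into the middle of the body, so the first pass is short;
    # every later pass runs all `unroll` copy statements.
    return [first] + [unroll] * (passes - 1)


def duff_copy(src):
    """Copy a sequence pass by pass, in the same chunks the device would use."""
    dest, pos = [], 0
    for size in duff_passes(len(src)):
        dest.extend(src[pos:pos + size])
        pos += size
    return dest


if __name__ == "__main__":
    # 14 is the length of "Hello, world!" plus the null terminator
    print(duff_passes(14))   # prints [6, 8]
```

So for the 14 bytes copied above, the switch lands on case 6, six copies run, and then one full pass of eight finishes the job.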

Update: further explanation is given below.


Spinning fans make me happy

16 February 2009

I get an inordinate amount of enjoyment out of writing a computationally expensive program, making my computer execute it, and then hearing the computer’s fans start spinning frantically as heat builds up in the processor, RAM, etc.


People doing good things

13 October 2008

Some developers in Africa (or at least with an interest in helping African people) are developing a platform for accumulating crisis information called Ushahidi. This is a good thing. It makes me feel happy when people do things to help other people. That’s how I know that people helping other people is a good thing to do.

In the spirit of software freedom, the idea with this system is (1) free flow of information and (2) power in numbers. Things are easier when knowledge is freely available and when everyone can contribute and learn. Keep an eye out for projects like this, especially those popping up in African countries and other developing nations, and help them survive.


What I do when I’m sad

6 October 2008

Sometimes when I’m feeling down, I take a look at the changelog for the most recent development release of Wine. Just look at all those beautiful bug fixes and implemented features. Wine is one of the most actively developed projects around, and it’s cool to see real improvement going down.

Makes me remember that this is paradise.