Recording completed tasks with Alfred

At Automattic, many teams have a process where they post weekly or biweekly updates. One of the things I’ve often found difficult as I write my personal update is remembering all of the little things I did over the past week.

Sure, since I work on a computer, there’s usually some paper trail for what I did. But gathering that paper trail meant combing through various sources and then trying to remember the things that didn’t leave a paper trail at all.

One of my favorite tools for getting all of the tasks that I completed in one place, and minimizing the number of things that weren’t tracked, was iDoneThis. But, it’s got a lot more functionality than I need. So, I set out to implement something to track completed tasks locally.

A simple Alfred workflow

Introducing the Dones workflow for Alfred! 🎉

This very simple workflow is triggered by typing done {query} into Alfred. The workflow then takes over and does the following:

  • Create a new file with a name like 2020-03-31.txt, where 2020-03-31 is the current date
  • Add the done as a new line in that file, prepended with a timestamp. Ex. 4:19:32 PM: hello world

With this setup, you’ll get a single file for each day that you record dones. You can then browse through those in the standard Mac file browser.
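
Under the hood, there isn’t much to it. Here’s a minimal bash sketch of the kind of script such a workflow runs. The ~/Dones directory and the argument handling are assumptions for illustration, not necessarily the exact script bundled with the workflow:

# Minimal sketch; the directory and argument handling are assumed, see above
DONES_DIR="$HOME/Dones"
mkdir -p "$DONES_DIR"

# One file per day, e.g. 2020-03-31.txt
DONE_FILE="$DONES_DIR/$(date +%Y-%m-%d).txt"

# Append the done as a new line, prepended with a timestamp, e.g. 04:19:32 PM: hello world
echo "$(date +'%I:%M:%S %p'): $1" >> "$DONE_FILE"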

Installation

To install the Dones workflow, simply download the workflow from GitHub and then double-click it to import it into Alfred.

Future work

At the moment, there is no definite future work planned. That being said, one nice-to-have that is on my mind is adding a command to sum up a period’s dones. For example, maybe something like dones_sum 7 that gets the past 7 days’ worth of dones.
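
For what it’s worth, since each day already lives in its own date-named file, a rough version of that command could just concatenate the last N files. Here’s a hypothetical sketch, assuming the dones files live in a single ~/Dones directory:

# Hypothetical dones_sum: print the last N days' worth of recorded dones.
# Note: this grabs the last N files recorded, which may span more than N calendar days.
dones_sum() {
	local days="${1:-7}"
	ls -1 "$HOME/Dones"/*.txt | sort | tail -n "$days" | xargs cat
}

# Example: dones_sum 7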

Install Unison 2.48.4 on Mac OS X with Homebrew

I use Unison to sync code between my local machine and my dev servers. To sync between two machines, Unison requires that the same version be installed on both.

Now, this isn’t usually a big deal, because once you get Unison set up, it’s set up. But, I usually get a bit frustrated when setting up a new development machine and ensuring that it has the same Unison version as my remote server.

Most recently, I needed to get Unison 2.48.4 on my local Mac so that it matched my remote server. But, Homebrew didn’t support Unison 2.48.4.

So, after getting some feedback from one of my coworkers, we came up with the following. Maybe you’ll find it helpful.

# Get rid of existing Unison
brew uninstall --force unison

# Check out the homebrew-core revision with the Unison 2.48.4 formula
cd /usr/local/Homebrew/Library/Taps/homebrew/homebrew-core
git checkout 05460e0bf3ae5f1a15ae40315940b2d39dd6ac52 Formula/unison.rb

# Install
brew install --force-bottle unison

# Set homebrew-core back to normal
git checkout master
git reset HEAD .
git checkout -- .

NOTE: If you get error: fatal: reference is not a tree: 05460e0bf3ae5f1a15ae40315940b2d39dd6ac52 when running the git checkout command above, recloning homebrew-core has fixed the issue for us. If you hit that error, run the following steps and then retry, starting from the git checkout 05460e0bf3ae5f1a15ae40315940b2d39dd6ac52 Formula/unison.rb command above.

cd /usr/local/Homebrew/Library/Taps/homebrew
rm -rf homebrew-core
git clone https://github.com/Homebrew/homebrew-core.git
cd homebrew-core

Recursively cast to array in PHP

I recently ran into an issue where JSON encoding some objects in my code wasn’t working properly. After experimenting, I realized that casting everything to an array before JSON encoding magically fixed things. 

Casting an object to an array is simple enough:

$variable_to_array = (array) $object_var;

But, what happens when an object or array contains references to other objects or arrays? The answer is that we then need to recursively cast a given input to an array. But, we don’t necessarily want to recursively cast everything to an array. For example, this is what happens when we cast 1 to an array:

return (array) 1;
=> array(1) {
  [0]=>
  int(1)
}

A simple fix is to recursively cast non-scalar values to an array. Here’s an example of how we would do that:

/**
 * Given mixed input, will recursively cast to an array if the input is an array or object.
 *
 * @param mixed $input Any input to possibly cast to array.
 * @return mixed
 */ 
function recursive_cast_to_array( $input ) {
	if ( is_scalar( $input ) ) {
		return $input;
	}

	return array_map( 'recursive_cast_to_array', (array) $input );
}
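
To see it in action without setting up a file, here’s a quick way to try it straight from the shell with php -r. The nested $data below is just made-up sample data:

php -r '
function recursive_cast_to_array( $input ) {
	if ( is_scalar( $input ) ) {
		return $input;
	}
	return array_map( "recursive_cast_to_array", (array) $input );
}

// Made-up nested data for illustration.
$data = (object) array(
	"name"   => "test",
	"nested" => (object) array( "id" => 1, "tags" => array( "a", "b" ) ),
);

echo json_encode( recursive_cast_to_array( $data ) ) . "\n";
'

This should print {"name":"test","nested":{"id":1,"tags":["a","b"]}}.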

How to remove files not tracked in SVN

At Automattic, we use SVN and Phabricator for much of our source control needs. One issue that I often run into is a warning about untracked files when creating a Phabricator differential:

You have untracked files in this working copy.

  Working copy: ~/public_html

  Untracked changes in working copy:
  (To ignore this change, add it to "svn:ignore".)
    test.txt

    Ignore this untracked file and continue? [y/N]

This warning’s purpose is to make sure that the differential being created has ALL of the changes so that a file isn’t forgotten when a commit is made. 

But, what if the untracked file(s) are from previously checking out and testing a patch? In that case, this warning is actually a bit annoying. 

The simple fix is to clear out the file(s) that aren’t tracked by SVN, which just means deleting them since SVN isn’t tracking them anyway. For a single file, that might look like:

rm test.txt

But, what if there are dozens or hundreds of files? I know I certainly wouldn’t want to run the command above dozens or hundreds of times to remove all of the files that aren’t tracked in SVN. Of course, we can automate all of the work by running something like the following ONCE:

svn st | grep '^?' | awk '{print $2}' | xargs rm -rf

Simply run the above from the root of the project and the untracked files should be removed. The above command is a bit much, so I’d recommend throwing it in an alias, which would look something like this:

alias clearuntracked='svn st | grep '\''^?'\'' | awk '\''{print $2}'\'' | xargs rm -rf'
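
Since both of these pipe straight into rm -rf, it can be worth double-checking what’s about to be deleted. Dropping the xargs rm -rf portion gives a harmless dry run that just lists the untracked files:

# Dry run: list the untracked files without deleting anything
svn st | grep '^?' | awk '{print $2}'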

Get unique values in file with shell command

Over the past year, there have been a couple of times where I've needed to sort some large list of values, more than 100 million lines in one case. 

In each case, I was dealing with a data source where there were surely duplicate entries. For example, duplicate usernames, emails, or URLs. To address this, I decided to get the unique values from the file before running a final processing script over them. This would require sorting all of the values in the given file and then deduping the resulting groups of values.

This sorting and deduping can be a bit challenging. There are various algorithms to consider, and if the dataset is large enough, we also need to make sure we handle the data in a way that doesn't run out of memory.

Shell commands to the rescue 🙂

Luckily, there are shell commands that make it quite simple to get the unique values in a file. Here's what I ended up using to get the unique values in a file:

cat $file | sort | uniq

In this example, we are:

  • Opening the file at $file
  • Sorting the file so that duplicates end up in a contiguous block
  • Dedupe so that only one value remains from each contiguous block

Here's another example of this command with piped input:

php -r 'for ( $i = 0; $i < 1000000; $i++ ) { echo sprintf( "%d\n", random_int( 0, 100 ) ); }' | sort -n | uniq

In this example, we are:

  • Generating 1,000,000 random numbers, between 0 and 100, each on its own line
  • Sorting that output so that like numbers are together
    • Note that we're using -n here to do a numeric sort.
  • Deduping that so that we end up with a unique number on each line

If we wanted to know how often each number occurred in the file, we could simply add -c to the uniq at the end of the command above. The resulting command would be php -r 'for ( $i = 0; $i < 1000000; $i++ ) { echo sprintf( "%d\n", random_int( 0, 100 ) ); }' | sort -n | uniq -c and we would get some output that looked like this:

9880 0
10179 1
9725 2
10024 3
9921 4
9893 5
9945 6
9881 7
9707 8
9955 9
9896 10
9845 11
9928 12
10024 13
10005 14
9834 15
9929 16
9764 17
9795 18
9932 19
9735 20
10082 21
9876 22
9835 23
9748 24
9947 25
9975 26
9841 27
9856 28
9751 29
10138 30
10037 31
10026 32
10128 33
9926 34
9821 35
9990 36
9920 37
9696 38
9886 39
9896 40
9815 41
9924 42
9739 43
9854 44
9936 45
9977 46
9873 47
9824 48
10043 49
10054 50
9870 51
9783 52
9901 53
9819 54
9882 55
10022 56
9899 57
9922 58
9922 59
9902 60
10036 61
9830 62
9792 63
9894 64
10008 65
9774 66
9918 67
9986 68
9814 69
9661 70
10117 71
10046 72
9704 73
10016 74
9601 75
9901 76
9923 77
9931 78
9909 79
9895 80
9771 81
10044 82
10059 83
9864 84
9938 85
9799 86
10006 87
9883 88
9880 89
9837 90
9701 91
9870 92
9998 93
9809 94
9883 95
10144 96
9935 97
9979 98
9922 99
9789 100
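
As an aside, sort can also handle the dedupe on its own via its -u flag, so the first example could be shortened to the following. Just note that you still need uniq if you want the -c counts shown above.

# Equivalent to cat $file | sort | uniq
sort -u $file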

What is the JavaScript event loop?

I remember the first time I saw a setTimeout( fn, 0 ) call in some React code. Luckily there was a comment with the code, so I kind of had an idea of why that code was there. Even with the comment though, it was still confusing.

Since then, I’ve read several articles about the event loop and got to a point where I was fairly comfortable with my understanding. But, after watching this JSConf talk by Philip Roberts, I feel like I’ve got a much better understanding.

In the talk, Philip uses a slowed down demonstration of the event loop to explain what’s going on to his audience. Philip also demonstrates a tool that he built which allows users to type in code and visualize all of the parts that make JavaScript asynchronous actions work.

You can check out the tool at http://latentflip.com/loupe, but I’d recommend doing it after watching the video.

How to install Unison 2.48 on Ubuntu

For developing on remote servers, but using a local IDE, I prefer to use Unison over other methods that rely on syncing files via rsync or SFTP.

But, one issue with Unison is that two computers must have the same version to sync. And since Homebrew installs Unison 2.48.4 and apt-get install unison installs something like 2.0.x, this meant I couldn’t sync between my computer and a development machine if I wanted to install Unison via apt-get.

No worries: by following the documentation, and a bit more searching, I was able to figure out how to build Unison 2.48.4 on my development server!

Note: I did run into a warning at the end of the build. But, from what I can tell, the build actually succeeded. The second-to-last step below helps you test if the build succeeded.

  • apt-get install ocaml
  • apt-get install make
  • curl -O https://www.seas.upenn.edu/~bcpierce/unison//download/releases/stable/unison-2.48.4.tar.gz
  • tar -xvzf unison-2.48.4.tar.gz
  • cd unison-2.48.4/src
  • make UISTYLE=text
  • ./unison to make sure it built correctly. You should see something like this:
    Usage: unison [options]
    or unison root1 root2 [options]
    or unison profilename [options]
    
    For a list of options, type "unison -help".
    For a tutorial on basic usage, type "unison -doc tutorial".
    For other documentation, type "unison -doc topics".
    
  • mv unison /usr/local/bin

After going through these commands, unison should be in your path, so you should be able to use unison from any directory without specifying the location of the binary.
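
Since the whole point is matching versions with the other machine, it’s worth double-checking before the first sync:

# Run this on both machines and confirm they report the same version, e.g. 2.48.4
unison -version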

How to apply a filter to an aggregation in Elasticsearch

When using Elasticsearch for reporting efforts, aggregations have been invaluable. Writing my first aggregation was pretty awesome. But, pretty soon after, I needed to figure out a way to run an aggregation over a filtered data set.

As with learning anything new, I was clueless about how to do this at first. Turns out, it’s quite easy. Within a few minutes, I came across some articles that recommended using a top-level filtered query, which seemed cool because I could just copy my filter up into it.

That’d look something like:

{
    "query": {
        "filtered": {}
    }
}

But, one of my coworkers pointed out that filtered queries have been deprecated and were removed in 5.x. Womp womp. So, the alternative was to convert the filter into a bool query with a must clause.

Example

You can find the Shakespeare data set that I’m using, as well as instructions on how to install it here. Using real data and actually running the query seems to help me learn better, so hopefully you’ll find it helpful.

Once you’ve got the data, let’s run a simple aggregation to get the list of unique plays.

GET shakespeare/_search
{
    "aggs": {
        "play_name": {
            "terms": {
                "field": "play_name",
                "size": 200
            }
        },
        "play_count": {
            "cardinality": {
                "field": "play_name"
            }
        }
    },
    "size": 0
}
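
The GET shakespeare/_search syntax above is what you’d paste into a Kibana/Sense console. If you’d rather hit Elasticsearch directly from a shell, a rough curl equivalent (assuming Elasticsearch is listening on localhost:9200) would look like this:

curl -s -H 'Content-Type: application/json' 'http://localhost:9200/shakespeare/_search' -d '{
    "aggs": {
        "play_name": { "terms": { "field": "play_name", "size": 200 } },
        "play_count": { "cardinality": { "field": "play_name" } }
    },
    "size": 0
}'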

Based on this query, we can see that there are 36 plays in the dataset, which is one off from what a Google search suggested. I’ll chalk that up to slightly off data perhaps?

Now, if we were to dig through the buckets, we could list out every single play that Shakespeare wrote, without having to iterate over every single doc in the dataset. Pretty cool, eh?

But, what if we wanted to see all plays that Falstaff was a speaker in? We could easily update the query to be something like the following:

GET shakespeare/_search
{
    "query": {
        "bool": {
            "must": {
                "term": {
                    "speaker": "FALSTAFF"
                }
            }
        }
    },
    "aggs": {
        "play_name": {
            "terms": {
                "field": "play_name",
                "size": 200
            }
        }
    },
    "size": 0
}

In this case, we’ve simply added a top-level query that returns only docs where FALSTAFF is the speaker. Then, we take those docs and run the aggregation. This gives us results like this:

{
   "took": 5,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 1117,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "play_name": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": "Henry IV",
               "doc_count": 654
            },
            {
               "key": "Merry Wives of Windsor",
               "doc_count": 463
            }
         ]
      }
   }
}

And based on that, we can see that FALSTAFF was in “Henry IV” and “Merry Wives of Windsor”.

Comments

Feel free to leave a comment below if you have critical feedback or if this helped you!

How to retry Selenium Webdriver tests in Mocha

While working on some functional tests for a hosting provider, I kept running into an issue where the login test was failing due to a 500 error. It appeared as if the site hadn’t been fully provisioned by the time my test was trying to login.

Initially, I attempted adding timeouts to give the installation process more time, but that seemed prone to error as well since the delay was variable. Also, with a timeout, I would’ve had to set the timeout to the longest expected time, and waiting a minute or so in a test suite didn’t seem like a good idea.

Getting it done

You’d think it’d be a quick fix, right? If this errors, do it again.

Within minutes, I had found a setting in Mocha that allowed retrying a test. So, I happily plugged that in, ran the test suite again, and it failed…

The issue? The JS bindings for Selenium Webdriver work off of promises, so they don’t quite mesh with Mocha’s built-in test retry logic. And not having dug into promises much yet, it definitely took me a bit to wrap my head around a solution.

That being said, there are plenty of articles out there that talk about retries with JavaScript promises, which helped bring me up to speed. But, I didn’t find any that were specifically about retrying promises with Selenium Webdriver in a Mocha test suite.

So, I learned from a couple of examples, and came up with a solution that’d work in my Selenium Webdriver Mocha tests.

The Code

You can find a repo with the code and dependencies here, but for convenience, I’m also copying the relevant snippets below:

The retry logic

The function below recursively calls itself, fetching a promise with the test assertions and decrementing the number of retries each time.

Each time the function is called, a new promise is created. In that promise, we use catch so that we can hook into the errors and decide whether to retry the test or throw the error.

Note: This would look a bit cleaner with ES6 syntax, but I didn’t want to set that up.

var handleRetries = function ( browser, fetchPromise, numRetries ) {
    numRetries = 'undefined' === typeof numRetries
        ? 1
        : numRetries;
    return fetchPromise().catch( function( err ) {
        if ( numRetries > 0 ) {
            return handleRetries( browser, fetchPromise, numRetries - 1 );
        }
        throw err;
    } );
};

The test

The original test, without retries, looked something like this:

test.describe( 'Can fetch URL', function() {
    test.it( 'page contains something', function() {
        var selector = webdriver.By.name( 'ebinnion' ),
            i = 1;
        browser.get( 'https://google.com' );
        return browser.findElement( selector );
    } );
} );

After integrating with the retry logic, it now looks like this:

test.describe( 'Can fetch URL', function() {
    test.it( 'page contains something', function() {
        var selector = webdriver.By.name( 'ebinnion' ),
            i = 1;
        return handleRetries( browser, function() {
            console.log( 'Trying: ' + i++ );
            browser.get( 'https://google.com' );
            return browser.findElement( selector );
        }, 3 );
    } );
} );

Note that the only thing we did differently in the test was put the Selenium Webdriver calls (which return a promise) inside a callback that gets called from handleRetries. Putting the calls inside this callback allows us to get a new promise each time we retry.

Comments?

Feel free to leave a comment if you have input or questions. Admittedly, I may not be too much help if it’s a very technical testing question, but I can try.

I’m also glad to accept critical feedback if there’s a better approach. Particularly an approach that doesn’t require an external module, although I’m glad to hear of those as well.

PHP – Get methods of a class along with arguments

Lately, I’ve been using the command line a lot more often at work. I found two things hard about using the command line to interact with PHP files:

  1. Figuring out the require path every time I opened an interactive shell
  2. Remembering what methods were available in a class and what arguments each method expected

The first was pretty easy to handle by writing a function that would require often-used files. The second turned out not to be too hard and is the subject of this post.

The code

Below is the code that I used to get the methods of an object as well as the arguments for each method.

<?php

/**
 * Prints each method of the given object or class name, along with the parameter names for each method.
 *
 * @param mixed $mgr An object instance or a class name.
 */
function print_object_methods( $mgr ) {
  foreach ( get_class_methods( $mgr ) as $method ) {
    echo $method;
    $r = new ReflectionMethod( $mgr, $method );
    $params = $r->getParameters();

    if ( ! empty( $params ) ) {
      $param_names = array();
      foreach ( $params as $param ) {
        $param_names[] = sprintf( '$%s', $param->getName() );
      }
      echo sprintf( '( %s )', implode(', ', $param_names ) );
    }
    echo "\n";
  }
}
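
As a quick way to try it out, save the function above to a file (the name print-object-methods.php below is just an example) and point it at any class, e.g. PHP’s built-in DateTime:

php -r 'require "print-object-methods.php"; print_object_methods( new DateTime() );'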

An example

Let’s use the Jetpack_Options class from Jetpack as an example. You can find it here: https://github.com/Automattic/jetpack/blob/master/class.jetpack-options.php

For that class, the above code would output:

get_option_names( $type )
is_valid( $name, $group )
is_network_option( $option_name )
get_option( $name, $default )
get_option_and_ensure_autoload( $name, $default )
update_option( $name, $value, $autoload )
update_options( $array )
delete_option( $names )
delete_raw_option( $name )
update_raw_option( $name, $value, $autoload )
get_raw_option( $name, $default )

As a note, in this case, it could also be nice to print out the docblock for each method instead of just the arguments to add some context. But, I didn’t need too much context for a file that I’m in pretty often. Your mileage may vary.

A Year of Google Maps & Apple Maps

I came across a really great article that compares changes in Google Maps and Apple Maps over a year. It’s fascinating to see how much Google is experimenting with and improving its product.

Similar to how a software engineer refactors their code before expanding it, Google has repeatedly refactored the styling of its map as it has added new datasets. And we see this in the evolution of Google Maps’s cartography:

As Google has added more and more datasets, it has continually rebalanced the colors, weights, and intensities of the items already on its map – each time increasing its map’s capacity for more.

Source: A Year of Google Maps & Apple Maps