Blog

Photos from the MyZeikl Building

While walking around Frankfurt, I found a very interesting building named MyZeikl. Go ahead, Google it. It’s got some interesting architecture.

What struck me the most while there was this mashup of red, metal, and the glass in the ceiling. A close second was the view when looking down from the top floor.

How to apply a filter to an aggregation in Elasticsearch

When using Elasticsearch for reporting efforts, aggregations have been invaluable. Writing my first aggregation was pretty awesome. But, pretty soon after, I needed to figure out a way to run an aggregation over a filtered data set.

As with learning all new things, I was clueless how to do this. Turns out, it’s quite easy. Within a few minutes, I came across some articles that recommended using a top-level query with a filtered argument, which seemed cool because I could just copy my filter up.

That’d look something like:

{
    "query": {
        "filtered": {}
    }
}

But, one of my coworkers pointed out that filtered queries were deprecated in 2.0 and removed in 5.0. Womp womp. So, the alternative was to just convert the filter to a bool must query.

Here’s an example:

Example

You can find the Shakespeare data set that I’m using, as well as instructions on how to install it here. Using real data and actually running the query seems to help me learn better, so hopefully you’ll find it helpful.

Once you’ve got the data, let’s run a simple aggregation to get the list of unique plays.

GET shakespeare/_search
{
    "aggs": {
        "play_name": {
            "terms": {
                "field": "play_name",
                "size": 200
            }
        },
        "play_count": {
            "cardinality": {
                "field": "play_name"
            }
        }
    },
    "size": 0
}

Based on this query, we can see that there are 36 plays in the dataset, which is one off from what a Google search suggested. I’ll chalk that up to slightly off data perhaps?

Now, if we were to dig through the buckets, we could list out every single play that Shakespeare wrote, without having to iterate over every single doc in the dataset. Pretty cool, eh?
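For what it's worth, here's a rough sketch of what digging through the buckets could look like in JavaScript. The `listBucketKeys` helper and the `sampleResponse` object are made up for illustration (the doc counts are placeholders, not real numbers from the data set). It only assumes the response has the usual `aggregations.<name>.buckets` shape that a terms aggregation returns.

```javascript
// Hypothetical helper: returns the bucket keys of a terms aggregation.
// `response` is the parsed JSON body of a search response, and
// `aggName` is the name given to the aggregation in the request.
var listBucketKeys = function( response, aggName ) {
    var agg = response.aggregations && response.aggregations[ aggName ];
    if ( ! agg || ! Array.isArray( agg.buckets ) ) {
        return [];
    }
    return agg.buckets.map( function( bucket ) {
        return bucket.key;
    } );
};

// Trimmed-down sample response; the doc counts are placeholders.
var sampleResponse = {
    aggregations: {
        play_name: {
            buckets: [
                { key: 'Henry IV', doc_count: 1 },
                { key: 'Merry Wives of Windsor', doc_count: 1 }
            ]
        }
    }
};

console.log( listBucketKeys( sampleResponse, 'play_name' ) );
```

If the aggregation is missing from the response, the helper returns an empty array instead of throwing.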

But, what if we wanted to see all plays that Falstaff was a speaker in? We could easily update the query to be something like the following:

GET shakespeare/_search
{
    "query": {
        "bool": {
            "must": {
                "term": {
                    "speaker": "FALSTAFF"
                }
            }
        }
    },
    "aggs": {
        "play_name": {
            "terms": {
                "field": "play_name",
                "size": 200
            }
        }
    },
    "size": 0
}

In this case, we’ve simply added a top-level query that returns only docs where FALSTAFF is the speaker. Then, we take those docs and run the aggregation. This gives us results like this:

{
   "took": 5,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 1117,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "play_name": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": "Henry IV",
               "doc_count": 654
            },
            {
               "key": "Merry Wives of Windsor",
               "doc_count": 463
            }
         ]
      }
   }
}

And based on that, we can see that FALSTAFF was in “Henry IV” and “Merry Wives of Windsor”.
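If you're building these request bodies from code rather than typing them into a console, a small helper can keep the "filter plus aggregation" pattern in one place. This is just a sketch: `buildFilteredAggBody` is a name I made up, and it does nothing more than assemble the same JSON shown in the Falstaff query above.

```javascript
// Hypothetical helper: wraps a filter clause and a set of aggregations
// into a search request body. "size": 0 skips returning the hits, since
// only the aggregation results are needed.
var buildFilteredAggBody = function( filterClause, aggs ) {
    return {
        query: {
            bool: {
                must: filterClause
            }
        },
        aggs: aggs,
        size: 0
    };
};

var body = buildFilteredAggBody(
    { term: { speaker: 'FALSTAFF' } },
    { play_name: { terms: { field: 'play_name', size: 200 } } }
);

console.log( JSON.stringify( body, null, 2 ) );
```

One thing worth knowing: a bool query also accepts a `filter` clause, which matches docs without contributing to scoring. Since we don't care about scores for an aggregation, swapping `must` for `filter` should give the same buckets.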

Comments

Feel free to leave a comment below if you have critical feedback or if this helped you!

How to retry Selenium Webdriver tests in Mocha

While working on some functional tests for a hosting provider, I kept running into an issue where the login test was failing due to a 500 error. It appeared as if the site hadn’t been fully provisioned by the time my test was trying to log in.

Initially, I attempted adding timeouts to give the installation process more time, but that seemed prone to error as well since the delay was variable. Also, with a timeout, I would’ve had to make the timeout be the longest expected time, and waiting a minute or so in a test suite didn’t seem like a good idea.

Getting it done

You’d think it’d be a quick fix, right? If this errors, do it again.

Within minutes, I had found a setting in Mocha that allowed retrying a test. So, I happily plugged that in, ran the test suite again, and it failed…

The issue? The JS bindings for Selenium Webdriver work off of promises, so they don’t quite mesh with the built-in test retry logic. And not having dug into promises much yet, it definitely took me a bit to wrap my head around a solution.

That being said, there are plenty of articles out there that talk about retries with JavaScript promises, which helped bring me up to speed. But, I didn’t find any that were for specifically retrying promises with Selenium Webdriver in a Mocha test suite.

So, I learned from a couple of examples, and came up with a solution that’d work in my Selenium Webdriver Mocha tests.

The Code

You can find a repo with the code and dependencies here, but for convenience, I’m also copying the relevant snippets below:

The retry logic

The function below recursively calls itself, fetching a promise with the test assertions, and decrementing the number of tries each time.

Each time the function is called, a new promise is created. In that promise, we use catch so that we can hook into the errors and decide whether to retry the test or throw the error.

Note: The code would look a bit cleaner in ES6, but I didn’t want to set that up.

var handleRetries = function ( browser, fetchPromise, numRetries ) {
    numRetries = 'undefined' === typeof numRetries
        ? 1
        : numRetries;
    return fetchPromise().catch( function( err ) {
        if ( numRetries > 0 ) {
            return handleRetries( browser, fetchPromise, numRetries - 1 );
        }
        throw err;
    } );
};

The test

The original test, without retries, looked something like this:

test.describe( 'Can fetch URL', function() {
    test.it( 'page contains something', function() {
        var selector = webdriver.By.name( 'ebinnion' ),
            i = 1;
        browser.get( 'https://google.com' );
        return browser.findElement( selector );
    } );
} );

After integrating with the retry logic, it now looks like this:

test.describe( 'Can fetch URL', function() {
    test.it( 'page contains something', function() {
        var selector = webdriver.By.name( 'ebinnion' ),
            i = 1;
        return handleRetries( browser, function() {
            console.log( 'Trying: ' + i++ );
            browser.get( 'https://google.com' );
            return browser.findElement( selector );
        }, 3 );
    } );
} );

Note that the only thing we did differently in the test was put the Selenium Webdriver calls (which return a promise) inside a callback that gets called from handleRetries. Putting the calls inside this callback allows us to get a new promise each time we retry.
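To see why the callback matters without spinning up a browser, here's a small sketch that swaps the Selenium calls for a stub. The `flakyLogin` function is made up for this example: it rejects on its first two calls and resolves on the third, which is roughly how my half-provisioned site behaved. The `handleRetries` function is copied from above so the snippet runs on its own.

```javascript
// handleRetries as defined in the post, repeated so this runs on its own.
var handleRetries = function( browser, fetchPromise, numRetries ) {
    numRetries = 'undefined' === typeof numRetries ? 1 : numRetries;
    return fetchPromise().catch( function( err ) {
        if ( numRetries > 0 ) {
            return handleRetries( browser, fetchPromise, numRetries - 1 );
        }
        throw err;
    } );
};

// Hypothetical stand-in for the Selenium calls: rejects on the first two
// calls and resolves on the third.
var attempts = 0;
var flakyLogin = function() {
    attempts++;
    return attempts < 3
        ? Promise.reject( new Error( 'HTTP 500 on attempt ' + attempts ) )
        : Promise.resolve( 'logged in on attempt ' + attempts );
};

// No real browser is needed for the stub, so null is passed instead.
var result = handleRetries( null, flakyLogin, 3 );
result.then( function( message ) {
    console.log( message );
} );
```

Because `flakyLogin` is called again on every retry, each attempt gets a fresh promise. Passing an already-created promise instead would just hand back the same rejection every time. Also note that `numRetries` counts retries after the first try, so a value of 3 allows up to four calls in total.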

Comments?

Feel free to leave a comment if you have input or questions. Admittedly, I may not be too much help if it’s a very technical testing question, but I can try.

I’m also glad to accept critical feedback if there’s a better approach. Particularly an approach that doesn’t require an external module, although I’m glad to hear of those as well.

PHP – Get methods of a class along with arguments

Lately, I’ve been using the command line a lot more often at work. I found two things hard about using the command line to interact with PHP files:

  1. Figuring out the require path every time I opened an interactive shell
  2. Remembering which methods were available in a class and what arguments each method expected

The first was pretty easy to handle by writing a function that would require often-used files. The second one turned out to not be too hard and is the subject of this post.

The code

Below is the code that I used to get the methods of an object as well as the arguments for each method.

<?php
function print_object_methods( $mgr ) {
  foreach ( get_class_methods( $mgr ) as $method ) {
    echo $method;
    $r = new ReflectionMethod( $mgr, $method );
    $params = $r->getParameters();

    if ( ! empty( $params ) ) {
      $param_names = array();
      foreach ( $params as $param ) {
        $param_names[] = sprintf( '$%s', $param->getName() );
      }
      echo sprintf( '( %s )', implode(', ', $param_names ) );
    }
    echo "\n";
  }
}

An example

Let’s use the Jetpack_Options class from Jetpack as an example. You can find it here: https://github.com/Automattic/jetpack/blob/master/class.jetpack-options.php

For that class, the above code would output:

get_option_names( $type )
is_valid( $name, $group )
is_network_option( $option_name )
get_option( $name, $default )
get_option_and_ensure_autoload( $name, $default )
update_option( $name, $value, $autoload )
update_options( $array )
delete_option( $names )
delete_raw_option( $name )
update_raw_option( $name, $value, $autoload )
get_raw_option( $name, $default )

As a note, in this case, it could also be nice to print out the docblock for each method instead of just the arguments to add some context. But, I didn’t need too much context for a file that I’m in pretty often. Your mileage may vary.

Join Us in the Fight for Net Neutrality

July 12 is an Internet-wide day of action in support of Net Neutrality. If you share our love of the free and open Internet and want to join the fight to preserve it, please join in!

Please take a moment today to help by:

(1) sending a message of support to the FCC, which you can do by visiting battleforthenet.com and

(2) enabling the Fight for Net Neutrality Plugin on your WordPress site, to show your support and encourage others to take action, too. Instructions can be found in this article.

Source: Join Us in the Fight for Net Neutrality