Liang Nuren – Failing In So Many Ways

Crucible: Mission Accomplished?

I think it’s important to take a look at whether or not Crucible is really making inroads into the players lost during the massive clusterfuck that was Incarna. After all, it wasn’t very long ago that blogs like this were being written.  I think this is an interesting comment in it:

One last thing: this chart seems to prove what a stagnant state EVE is in at the moment: other than the +60D peak after Incursion, the 30 day rolling average number of players logged into EVE has not changed from 30,000 in 27 months.

I took the liberty of poking the eve-offline data and putting it into a more useful form.  There’s some massaging of the data – namely I filtered out any row with less than 10,000 players and duplicated the previous day for any missing days.  The goal here was to eliminate momentary “blips” in the chart because Chribba’s site went down, or because Eve had a particularly long patch.  Those simply aren’t interesting data points for me here.
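That cleanup pass is simple enough to sketch in Python (I actually did it in SQL, and the names here are made up for illustration):

```python
from datetime import timedelta

def clean(rows):
    """rows: (date, players) tuples sorted by date.  Drops sub-10k
    blips and duplicates the previous day across any resulting gaps."""
    out = []
    prev_day, prev_players = None, None
    for day, players in rows:
        if players < 10000:              # patch days, eve-offline outages
            continue
        if prev_day is not None:
            gap = prev_day + timedelta(days=1)
            while gap < day:             # forward-fill missing days
                out.append((gap, prev_players))
                gap += timedelta(days=1)
        out.append((day, players))
        prev_day, prev_players = day, players
    return out
```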

I’m also notoriously terrible at Excel wizardry, so I did most of this in a database – Oracle in particular.  Mostly because it was handy, but also because it gave me the opportunity to learn more about Oracle’s analytic functions.  I came from a PostgreSQL background, and so far at this job I haven’t really had a lot of reason to play with them.  Maybe it’s not having to do a ton of manual BI requests, or maybe it’s just because people want the raw data to work their own Excel wizardry.  Dunno, but at least some small hole in my knowledge is filled.

For what it’s worth, this is the basic form of the query:

select
    to_char(logtime, 'YYYY-MM-DD HH24:MI:SS')                                           AS "date",
    raw_pl                                                                              AS "raw_max_players",
    pl                                                                                  AS "adjusted_max_players",
    trunc(avg(pl) over (order by logtime rows between 7-1 preceding and current row))   AS "7 day",
    trunc(avg(pl) over (order by logtime rows between 30-1 preceding and current row))  AS "30 day",
    trunc(avg(pl) over (order by logtime rows between 60-1 preceding and current row))  AS "60 day",
    trunc(avg(pl) over (order by logtime rows between 90-1 preceding and current row))  AS "90 day",
    trunc(avg(pl) over (order by logtime rows between 120-1 preceding and current row)) AS "120 day",
    trunc(avg(pl) over (order by logtime rows between 180-1 preceding and current row)) AS "180 day",
    trunc(avg(pl) over (order by logtime rows between 365-1 preceding and current row)) AS "365 day"
from (
    select
        t.logtime                                              AS logtime,
        max(p.pl)                                              AS raw_pl,
        last_value(max(p.pl) ignore nulls) over (order by t.logtime) AS pl
    from days_table t
        left outer join all_tz p on t.logtime = p.logtime
    where t.logtime between (select min(logtime) from all_tz) and (select max(logtime) from all_tz)
    group by t.logtime
    ) x 
order by
    to_char(logtime, 'YYYY-MM-DD HH24:MI:SS')
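If you’d rather sanity-check those moving averages outside the database, the same trailing-window average is a few lines of Python (a sketch, not what I actually ran):

```python
def rolling_avg(values, window):
    """Trailing moving average over the last `window` values,
    truncated to an int like the trunc() calls in the query.
    Near the start of the series it averages however many rows exist."""
    out, total = [], 0
    for i, v in enumerate(values):
        total += v
        if i >= window:
            total -= values[i - window]   # slide the window forward
        out.append(int(total / min(i + 1, window)))
    return out
```

For example, rolling_avg([10, 20, 30, 40], 2) gives [10, 15, 25, 35].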

These are the basic forms of the other tables and views:

create table pl (
    logtime date,
    pl number
);

create or replace view all_tz as
select trunc(logtime) as logtime, max(pl) as pl
from pl
where pl >= 10000
group by trunc(logtime)

create or replace view us_tz as
select trunc(logtime) as logtime, max(pl) as pl
from pl
where pl >= 10000 
    and to_char(logtime, 'HH24:MI:SS') != '00:00:00'
    and cast(to_char(logtime, 'HH24') as number) between 0 and 6
group by trunc(logtime)

I remembered one of the things I wanted done the first time around was to split the data and see just how Eve fared in any given timezone.  Now, I admit that I’m mostly interested in this because I play at a slightly odd hour and I find it frustrating to not find anyone to shoot at.  So I took the time to filter the data based on Swearte’s suggestion on Twitter:

  • US TZ: 00:00 to 06:00 Eve Time
  • AUS TZ: 10:00 to 15:00 Eve Time
  • EU TZ: 15:00 to 00:00 Eve Time
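In code, the bucketing is just a comparison on the hour of each sample (hours 06:00 to 10:00 fall into none of the buckets):

```python
def tz_bucket(hour):
    """Map an Eve Time hour (0-23) to one of the timezone buckets above.
    Returns None for the 06:00-10:00 dead zone between US and AUS."""
    if 0 <= hour < 6:
        return 'US'
    if 10 <= hour < 15:
        return 'AUS'
    if hour >= 15:
        return 'EU'
    return None
```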

If you look carefully at the example timezone view, you can see some peculiar 00:00:00 filtering. I filtered out rows taken exactly at midnight because so much of the data set had the time portion stripped off – and thus I don’t feel that it was directly attributable to any particular timezone. Though looking at the way the EU TZ converges so heavily with the 60 day moving average, I could be persuaded to call it EU TZ.

Here’s a graph of what I found (courtesy of Google Docs’ save-chart-image feature) – and it’s very encouraging for my timezone.  From this perspective, it seems that Incarna was a total disaster and Crucible was a smashing success.  There are some fascinating things in the numbers – for example, the US and EU TZ appear to be growing at a very similar rate, but the AUS TZ is taking off like a house afire.

The EU TZ had its all time 60 day rolling average high of 45.5k in Feb 2011, and by Dec 2011 it had dropped to 37.6k – down a hair over 17%.  However, it’s been rising steadily and is back up to 41.8k – either an 11% improvement over the recent slump or only 8% down from the all time high.

The US TZ had its all time high of 33.1k in Feb 2011, and by October it plunged to 25.3k, where it stayed until December.  Since then, it’s been recovering fairly well and is back up to 29.6k.  This means that the US TZ was down almost 24%, but has grown 17% since December for a net loss of 10.5%.

The AUS TZ had its all time high of 29.3k in May 2011 and was down to 22.7k by October – a 22.5% loss.  By Dec 18th, the AUS TZ was hitting record highs, and it is now at 32.8k.  Crucible has seen almost 45% growth in the AUS TZ since the Incarna Slump – which puts it over 10% higher than it’s ever been.  This is fantastic news for me, because it means I will soon have more people to shoot when I’m awake.
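For the curious, those percentages are easy to re-derive from the stated 60 day averages:

```python
def pct_change(old, new):
    """Percent change from old to new; negative means a drop."""
    return round((new - old) / old * 100, 1)

# EU TZ: 45.5k high -> 37.6k slump -> 41.8k now
print(pct_change(45.5, 37.6))   # -17.4: "a hair over 17%" down
print(pct_change(37.6, 41.8))   # 11.2: ~11% off the slump
print(pct_change(45.5, 41.8))   # -8.1: ~8% below the high

# AUS TZ: 29.3k high -> 22.7k slump -> 32.8k now
print(pct_change(22.7, 32.8))   # 44.5: "almost 45%" growth
```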

So I guess at this point there’s nothing to do but congratulate CCP on their hit expansion.

The raw data is available on my blog here:


Filed under: Databases, Eve, Gaming, Software Development

Pair Programming, Code Reviews, and Data Warehousing

Code Reviewing

Code reviewing [Wikipedia] is the practice of having some form of peer review of finished code in order to ensure that it does what it’s supposed to do and that the approach taken to solve the problem was a good one.  There are two really common forms of code review – the formal code review and the lightweight code review.  A formal code review involves a thorough review and understanding of every line of code, frequently by everyone on a team.  Obviously, this is a very heavy process, and formal code reviews are considered too time intensive for anything but the most sensitive code; they are considered almost antiquated these days.  Lightweight code reviews tend to be more informal and involve shorter looks at smaller blocks of code – but the danger is that the review can become meaningless “rubber stamping”.  Both formal and lightweight code reviews have been shown to decrease the defect rate and improve knowledge transfer within a team.

Pair Programming

Pair programming [Wikipedia] is the practice of having two (or more) developers work on the same piece of code at the same time at the same work station.  In a very real way, pair programming is “on the fly” code reviewing – as such, it also lowers the defect rate and improves knowledge transfer.  It’s generally accepted that two programmers get a single piece of work done faster than one, but not twice as fast.  There is a net productivity loss when pair programming, and it’s hoped that the benefits make up for it.  I’ve personally seen it work a variety of ways, from Driver/Navigator to Test Ping Pong.  In all cases, both parties are expected to fully understand the overall design and code being written.

Data Warehousing

Data Warehousing [Wikipedia] is a branch of computing which involves the creation and care of large stores of data for the purpose of answering questions.  For instance, it is useful to know how many people clicked on a particular ad banner, or how many RC helicopters were sold at Best Buy stores in Ohio.  For the Eve readers, killboards are examples of either Data Marts or Data Warehouses – depending on who you ask.  The discipline is closely related to data mining [Wikipedia], which often makes use of a data warehouse.

Most data warehousing is done via Extract, Transform, Load [Wikipedia] processes in databases like PostgreSQL, Oracle, and MySQL, though certainly most serious data warehousing is done with a combination of technologies involving Distributed File Systems [Wikipedia] and Map/Reduce.  To give you some idea of the scales involved: the largest single-instance databases in the world weigh in at over 2 PB, and data about anything larger is amazingly scarce.  I’d estimate some of the larger data warehouses in the world weigh in at hundreds of PB now.  Personally, I’ve worked with data warehouses from 500GB processing millions of facts per day up to 150TB+ processing up to trillions of facts per day.  I’d say your average data warehousing company isn’t likely to see more than 75GB of data per day and will store something on the order of 500GB-2TB.
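To make the ETL shape concrete, here’s a toy pass over some invented ad-click logs (every name and number here is made up for illustration):

```python
def extract():
    # stand-in for pulling raw log lines out of a source system
    return ["2011-12-01,banner_a,click",
            "2011-12-01,banner_b,view",
            "2011-12-01,banner_a,click"]

def transform(lines):
    # keep only the click events and roll them up per banner
    counts = {}
    for line in lines:
        day, banner, event = line.split(",")
        if event == "click":
            counts[banner] = counts.get(banner, 0) + 1
    return counts

def load(facts, warehouse):
    # in real life this is a bulk load into fact tables, not a dict update
    warehouse.update(facts)

warehouse = {}
load(transform(extract()), warehouse)
# warehouse now holds {'banner_a': 2}
```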

The Dilemma

The internet debates over pair programming vs code reviews seem to be endless, but most of the teams I’ve encountered practicing some form of XP have a fairly strong preference for pair programming over code reviews.  The argument tends to go that what is really important is the second set of eyes on the code.  Furthermore, pair programming naturally avoids the danger of “rubber stamp” code reviews, because it’s much harder to rubber stamp when your reviewer is helping write the code.  These are absolutely valid observations, and I’m a big fan of pair programming.

However, I feel like the right answer for a data warehousing team is not to pick one – but to pick both when possible.  While this does mean that the process is very slightly heavier, I want to point out that the cost of failure is much higher.  A friend of mine points out that when most SaaS developers make a mistake, they fix it and bounce a web server – but when I make a mistake, we spend three weeks (re)migrating data.  Ultimately what everyone involved – from product managers to the developers – wants is for the team to deliver results in a timely manner… and really, three weeks is a hell of a delay because you didn’t spend 20 minutes doing a code review.

So please consider the cost of failure when you’re considering whether you should do pair programming or code reviews.

Filed under: Data Warehousing, Databases, Software Development

Database Asserts and Oracle Fun

I should begin by explaining that my particular talents lean toward processing extremely large quantities of data.  In my career I’ve grown data warehouses from infantile to quite large – and utilized everything from Postgres to NoSQL and distributed processing frameworks like Hadoop.  The bulk of my work experience is with Perl and Python – and I now (intentionally) work at a Java/Oracle/Hadoop shop.  I really felt the need to branch out and understand what it is that Oracle brings to the table that Postgres doesn’t.

The code base at work is a bit old and sometimes interesting.  Most of it’s in PL/SQL, Java, and the ever-present XML.  Despite our best efforts to exterminate PL/SQL, it continues to propagate on occasion – and testing it with the normal Hibernate methods gets tedious really fast.  In an effort to improve the ease of writing tests around this PL/SQL code I have been writing some specialized direct SQL asserts.
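The asserts themselves live in the Java test harness at work, but the shape of the type coercion is roughly this (a Python sketch with invented names, not the real code):

```python
from datetime import date, datetime

def normalize(value):
    """Coerce one result cell to a canonical string so dates, datetimes,
    and pre-formatted strings all compare equal."""
    if isinstance(value, (datetime, date)):
        return value.strftime('%Y-%m-%d %H:%M:%S')
    return str(value)

def assert_rows_equal(expected, actual):
    """Compare two row sets cell-by-cell after normalization,
    preserving row order."""
    exp = [[normalize(c) for c in row] for row in expected]
    act = [[normalize(c) for c in row] for row in actual]
    assert exp == act, "expected %s, got %s" % (exp, act)
```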

Part of being a convenient and easy to use assert is being able to take a variety of types – from whatever Hibernate object to a Date or even a string representation.  While I was working on making sure that ordering was preserved, I ran across this particular behavior in Oracle:

SQL> create table foo (some_date_column date);

Table created.

SQL> insert into foo (some_date_column) values (sysdate);

1 row created.

SQL> commit;

Commit complete.

SQL> select to_char(some_date_column, 'YYYY-MM-DD HH24:MI:SS') as some_date_column from foo order by to_char(some_date_column, 'YYYY-MM-DD HH24:MI:SS');
select to_char(some_date_column, 'YYYY-MM-DD HH24:MI:SS') as some_date_column from foo order by to_char(some_date_column, 'YYYY-MM-DD HH24:MI:SS')
ERROR at line 1:
ORA-01722: invalid number

Exciting, eh?  Some experimentation with the query revealed that the problem comes from aliasing the formatted column as the original column name and then ordering by the formatted expression: in the order by, the column alias takes precedence over the table column, so to_char() gets handed the already-formatted string and Oracle’s implicit string-to-number conversion fails with ORA-01722.  Removing the order by or changing the column alias resulted in the query working as expected (though with a corresponding change in behavior).

Either way, it wasn’t very useful for my hopefully generic SQL assert, and I felt compelled to find a workaround.  Fortunately, I was able to come up with one without burning too much time.  Simply wrap the select columns in a subselect and pull the order by into the outer select – as an added bonus, you only have to do the formatting once this way.

It looks like this:

SQL> select some_date_column from ( select to_char(some_date_column, 'YYYY-MM-DD HH24:MI:SS') as some_date_column from foo ) x order by some_date_column;

2011-10-28 19:03:02

Fun times with Oracle, eh?

Filed under: Databases, Software Development