
Jupyter client library

TL;DR: I have been thinking about alternative frontends to the Jupyter notebook. The first step is to abstract communication with Jupyter kernels, and I have started writing a client library to do this. Project repository.

I am a huge fan of the Jupyter notebook environment. It is a fantastic way to explore a new data analysis approach and to keep results together with their implementation and explanation. I wanted to understand how the Jupyter notebook/lab frontend interacts with the kernel backends, so I have started writing a client library.

Making windows feel more like Linux

I haven’t posted for a while, and this is not going to be a very useful post, you have been warned… At work we use Windows. I vowed when I left my last job that I’d never go to work somewhere that uses Windows. I am very much still not a fan, especially since I have not got admin rights and we are using Windows 8 - one of the most hated versions of Windows since ME.

Emcee in Rust

I’ve been re-implementing the Python emcee library in Rust. I thought it would be a good project to tackle for a few reasons: I actively use it in my own work, and am reasonably familiar with the API and how it works; it has few external dependencies and is mostly pure Python; it performs CPU-limited computations, which suit a compiled high-performance language; and the Python version has parallelism to increase speed, which should be easily achievable with Rust. (We’ll see that for the time being the last point has been put on hold.

Python parallelism cheat sheet (part 2)

This blog post is the second in a series I am writing, covering methods of simple parallelism. The following posts cover more convenient methods, as well as some things that should be considered. Parallelism methods: Basics and introduction; Function objects. If I’ve skipped your favourite method of parallelism, feel free to tweet me or add a comment on the tracking issue to let me know. This method was brought to my attention by Tom Marsh, and is a nice alternative to using functools.

Python parallelism cheat sheet

I often get asked “how can I parallelise my Python code?”. I’ve come up with this simple cheat sheet to explain it. I will only cover the most common class of parallel problem here: embarrassingly parallel problems. This blog post is the first in a series I am writing, covering methods of simple parallelism. The following posts cover more convenient methods, as well as some things that should be considered.
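As a rough illustration of what "embarrassingly parallel" means in practice, here is a minimal sketch using the standard library's multiprocessing.Pool. It is not necessarily the approach the post itself takes; the names parallel_map and square are hypothetical and chosen only for this example.

```python
from multiprocessing import Pool


def parallel_map(func, items, processes=2):
    """Apply func to every item independently, spread over worker processes."""
    with Pool(processes=processes) as pool:
        # Pool.map splits the items across the workers and returns
        # the results in the same order as the inputs.
        return pool.map(func, items)


def square(x):
    # Hypothetical stand-in for an expensive, independent computation.
    return x * x


if __name__ == "__main__":
    print(parallel_map(square, range(10)))
```

Each call to the worker function depends only on its own input, which is what makes the problem embarrassingly parallel: no communication between workers is needed.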

Installing rust on older linux systems

At work we use SLES 11, which has quite old versions of openssl and of the installed certificates. I was getting certificate errors when trying to install Rust with the rustup tool. I searched for help, and in the end followed this advice: download a more recent certificate bundle (e.g. from certifi or Mozilla), and set the environment variable SSL_CERT_FILE to point to this new file. This works for both rustup and cargo, meaning I can develop with Rust on my work machine.

Fighting the compiler

I’m learning Rust at the moment, which I’m finding quite an interesting challenge. I agree with a lot of the Rust principles and find it extremely comforting that the compiler has got my back, but it’s bringing me back to my early times learning C and “fighting with the compiler”. How many hours did I spend adding “&” and “*” to variables to pass into functions before I really understood what it meant for a function to take a pointer?

Numpy functions may not do what you think

Numpy has the ability to mask arrays and ignore their values for certain computations, called “masked arrays”. They contain a .mask attribute which is a boolean array, True where the value should be masked and False otherwise. Numpy also comes with a suite of functions which can handle this masking naturally. Typically for a function in the np. namespace, there is a masked-array-aware version under the namespace: np.median => np.
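A small sketch of the difference, assuming NumPy is installed. The array contents are made up for illustration, and the result of the plain np.median call is deliberately not stated because it can depend on the NumPy version.

```python
import numpy as np

# Illustrative data: the last value is an outlier that we mask out.
values = np.ma.masked_array(
    [1.0, 2.0, 3.0, 1000.0],
    mask=[False, False, False, True],
)

# The plain function is not guaranteed to respect the mask.
plain = np.median(values)

# The masked-array-aware version ignores the masked element.
masked = np.ma.median(values)

print("np.median:   ", plain)
print("np.ma.median:", masked)
```

With the outlier masked, np.ma.median works on the three remaining values only.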

Command line inconsistency

RTFM! Today I brought down our head node at work, because of a misunderstanding of command line arguments for a linux program. In fairness, I should have read the man page more carefully for the entry in question! I was using xargs for some nice command line parallelism and process running. The command I ran was: ls | grep action119 | grep exposureCycle | xargs -n 1 -I {} find {} -name 'IMAGE*.

Add timestamps to stdout

I spent some time trying to get timestamps added to C++ printing, e.g. through cout. A naive approach is to write a function get_current_time() and put it before all printing statements, e.g.: cout << get_current_time() << "Message" << endl; This requires changing all logging statements. Then my googling stumbled upon this question, which had an elegant solution incorporating a decorator object. Further down the page, however, I came upon a much nicer solution that transcends languages and programs and can be applied to running shell commands.