I develop, test, and deploy quantitative Python models for predicting dynamic environments such as stock markets, electricity prices, sports betting, and other processes of a stochastic nature.

My typical job in this field is to take a bulk of raw data and extract insights from it that help my clients make better decisions.


I typically focus on time-series data that is easy to quantify, such as financial series, sales series, and price and volume series of goods and services. However, simple data doesn’t mean simple analysis: my key strength is extracting as much information as possible from raw metrics.

After initial data processing, I build a quantitative algorithm or machine-learning model to meet the project goals, whatever they may be. Typically, the goal is to increase profit efficiency, raise the mathematical expectation, or improve the probability of success.

I also specialize in building models on multidimensional data, for example options chains, stock portfolio rebalancing, or even horse racing. In other words, cases where we need to combine and compare multiple data sources and make decisions on a relative basis.
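
As a rough illustration of deciding on a relative basis, here is a minimal sketch that scores each item against the rest of its group instead of judging it in isolation. The tickers, the momentum values, and the function name cross_sectional_zscores are hypothetical and exist only to make the example runnable; a real model would use its own features and weighting.

```python
from statistics import mean, pstdev


def cross_sectional_zscores(values):
    """Score each item relative to the group: (value - group mean) / group std."""
    data = list(values.values())
    mu = mean(data)
    sigma = pstdev(data)
    if sigma == 0:
        # All items are identical, so no item stands out from the group.
        return {name: 0.0 for name in values}
    return {name: (v - mu) / sigma for name, v in values.items()}


# Hypothetical momentum readings for four made-up instruments.
momentum = {"AAA": 0.12, "BBB": -0.03, "CCC": 0.05, "DDD": 0.01}

scores = cross_sectional_zscores(momentum)
# Rank from strongest to weakest relative score.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

The same idea applies to an options chain or a race card: what matters is how each candidate compares with the others in the same group at the same moment.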


  • Financial modelling. Profit prediction models, backtesting models, trading strategies, portfolio management.
  • Risk modelling. VaR modelling, Monte Carlo simulations, data bootstrapping.
  • Machine learning. Feature selection, feature importance research, Random Forests, XGBoost.
  • Econometric models. Decision trees, logistic regression, OLS models, GARCH models for volatility.
  • Model optimization. Genetic optimization algorithms, gradient boosting, CMA-ES, PSO.
  • Data scraping. Website parsers, binary format converters, SQL to CSV converters, general data set building with data cleanup and feature preparation, REST API scraping.
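
To make the risk-modelling items above more concrete, here is a minimal sketch of a historical VaR estimate with a bootstrap interval around it. It uses only the Python standard library; the function name bootstrap_var, its parameters, and the simulated return series are assumptions for illustration, not code from any of my projects.

```python
import random


def bootstrap_var(returns, confidence=0.95, n_resamples=2000, seed=42):
    """Estimate one-period historical VaR and a 90% bootstrap interval."""
    rng = random.Random(seed)

    def var_of(sample):
        # VaR is the loss at the (1 - confidence) quantile of the returns.
        ordered = sorted(sample)
        idx = int((1.0 - confidence) * len(ordered))
        return -ordered[idx]

    point = var_of(returns)
    # Resample with replacement and recompute VaR on each resample.
    estimates = sorted(
        var_of(rng.choices(returns, k=len(returns))) for _ in range(n_resamples)
    )
    lower = estimates[int(0.05 * n_resamples)]
    upper = estimates[int(0.95 * n_resamples) - 1]
    return point, lower, upper


# Hypothetical daily returns, simulated only to make the example runnable.
_rng = random.Random(0)
sample_returns = [_rng.gauss(0.0005, 0.01) for _ in range(500)]

var_95, var_low, var_high = bootstrap_var(sample_returns)
print(f"95% VaR: {var_95:.4f} (bootstrap 90% interval: {var_low:.4f} to {var_high:.4f})")
```

The interval shows how much the VaR figure depends on the particular sample, which is often more useful to a client than a single number.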

Tool set

I primarily use Python for data analysis because it has the best libraries for the job and is one of the most efficient languages for quick research.

My technological stack looks like the following:

  • Python – as a general-purpose language
  • Cython / Numba – for high-performance calculations on large amounts of data
  • MongoDB – for data storage
  • Pandas / NumPy – for number crunching
  • SciPy / Scikit-learn – for machine learning models
  • Jupyter Notebook / Plotly Dash – for data visualization / dashboarding

Open-source and sample projects

Data science portfolio – a very rough example of what my analysis notebook may look like

yauber-algo – open-source collection of standalone algorithms for financial time series analysis.

yauber-backtester – open-source stock / crypto / futures backtester with portfolio management support

cython-tools – a project for highly productive development in Cython (debugger, profiler, coverage, unit tests)