# Personal Semantic Search

I have a few thousand notes, a pile of blog posts and tweets, a huge amount of sent email, and various other data, all written by me. To get more value out of this dataset, I built a prototype of a personal search engine that uses NLP. I set up a system that automatically connects my …

# Predicate Logic Solving: TPTP vs SMT

Many interesting problems can be expressed declaratively in predicate logic. Real-life examples include scheduling, logistics, and the verification of software and electric circuits. That is, the problems are hard, and logic provides a way to solve them declaratively. Logic solvers take the problem declaration as input and spit out the solution. Examples of …

# Bayesian Data Analysis of Capacity Factor

Stan is a platform for probabilistic programming. To demonstrate its features, I analyzed the capacity factor of wind energy in Finland. Wind energy is feasible in Finland, and we have quite high seasonal variance, so modeling wind data makes an interesting case. This case study presents a Bayesian data analysis process starting from data, …

# Covid-19 False Positives

Lab test false positive rates may feel counter-intuitive. Let’s take a closer look at the state-of-the-art Covid-19 real-time PCR test. In “Interpreting a covid-19 test result” (The BMJ, May 2020), Watson et al. say that the sensitivity of the test is between 71% and 98%, and the specificity is around 95%. The English statistics authority estimates that in …
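To see why the false positive rate can feel counter-intuitive, here is a minimal sketch of the Bayes' rule arithmetic. The sensitivity bounds and the specificity are the figures quoted above; the prevalence is a purely hypothetical placeholder (it is not the estimate referred to in the post), and the function name is my own.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Return P(infected | positive test) using Bayes' rule."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical prevalence chosen only for illustration (an assumption,
# not a figure from the post or from any statistics authority).
ASSUMED_PREVALENCE = 0.01

# Sensitivity range (71% to 98%) and specificity (95%) as quoted in the text.
for sensitivity in (0.71, 0.98):
    ppv = positive_predictive_value(sensitivity, 0.95, ASSUMED_PREVALENCE)
    print(f"sensitivity={sensitivity:.2f}: P(infected | positive) = {ppv:.3f}")
```

With a low assumed prevalence like this one, most positive results come from the large uninfected group, so the probability of being infected given a positive test stays well below one half at either end of the sensitivity range.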

# 8 Requirements of Intelligence

What is intelligence, in the context of machine learning and AI? A classic from 1979, Hofstadter’s GEB, gives eight essential abilities for intelligence: to respond to situations very flexibly; to take advantage of fortuitous circumstances; to make sense out of ambiguous or contradictory messages; to recognize the relative importance of different elements of a situation; …

# Visualizing Neural Net Using Occlusion

I got good results using an RNN for a text categorization task. Then I tried using a 1D CNN for the same task. To my surprise, I got even better results, and the model was two orders of magnitude smaller. How can such a lightweight model perform so well? Out of curiosity, and also to verify the results, …

# Why Loss and Accuracy Metrics Conflict?

A loss function is used to optimize a machine learning algorithm. An accuracy metric is used to measure the algorithm’s performance (accuracy) in an interpretable way. It goes against my intuition that these two sometimes conflict: loss is getting better while accuracy is getting worse, or vice versa. I’m working on a classification problem and once again …
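A small toy example makes the conflict concrete. This is a sketch under my own assumptions: binary cross-entropy as the loss, a 0.5 decision threshold for accuracy, and two hand-picked sets of predictions (hypothetical numbers, not results from the classification problem mentioned above).

```python
import math


def cross_entropy(probs, labels):
    """Mean binary cross-entropy; probs are predicted P(label == 1)."""
    total = sum(math.log(p if y == 1 else 1.0 - p) for p, y in zip(probs, labels))
    return -total / len(labels)


def accuracy(probs, labels, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    hits = sum((p >= threshold) == (y == 1) for p, y in zip(probs, labels))
    return hits / len(labels)


labels = [1, 1, 1, 1]
# Hypothetical predictions at two points in training.
before = [0.55, 0.55, 0.55, 0.05]  # three barely correct, one confidently wrong
after = [0.45, 0.95, 0.95, 0.45]   # two confidently correct, two narrow misses

for name, probs in (("before", before), ("after", after)):
    print(f"{name}: loss={cross_entropy(probs, labels):.3f}, "
          f"accuracy={accuracy(probs, labels):.2f}")
```

Here the loss improves from "before" to "after" while the accuracy drops from 0.75 to 0.50. The loss rewards moving probability mass toward the correct label, whereas accuracy only registers which side of the threshold each prediction lands on.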

# Spell Out Convolution 1D (in CNN’s)

I’m working on a text analysis problem and got slightly better results using a CNN than with an RNN. The CNN is also (much) faster than a recurrent neural net. I wanted to tune it further but had difficulties understanding the Conv1D at the nuts-and-bolts level. There are multiple great resources explaining 2D convolutions, see …
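As a rough illustration of what a 1D convolution computes, here is a plain-Python sketch of a single filter sliding over a sequence. It assumes "valid" padding, stride 1, and no activation; the function name and the toy numbers are hypothetical, and this is not how any particular library implements its Conv1D layer.

```python
def conv1d_single_filter(sequence, kernel, bias=0.0):
    """Slide one filter along the time axis ("valid" padding, stride 1).

    sequence: list of time steps, each a list of channel values
              (for text, e.g. one embedding vector per token).
    kernel:   list of kernel_size rows, each with one weight per channel.
    """
    kernel_size = len(kernel)
    outputs = []
    for start in range(len(sequence) - kernel_size + 1):
        window = sequence[start:start + kernel_size]
        total = bias
        # Multiply every value in the window by its matching weight and sum.
        for step_values, step_weights in zip(window, kernel):
            for value, weight in zip(step_values, step_weights):
                total += value * weight
        outputs.append(total)
    return outputs


# Hypothetical toy input: 4 tokens with 2-dimensional embeddings.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
# Hypothetical filter spanning 2 consecutive tokens.
kernel = [[1.0, -1.0], [0.5, 0.5]]

print(conv1d_single_filter(tokens, kernel))  # prints [1.5, 0.0, 1.0]
```

The filter covers all channels at once and moves only along the sequence, so 4 tokens and a kernel of size 2 give 3 output values. A layer with many filters repeats this with a separate kernel per filter.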

# Less Sexism in Finnish

Machine learning models are only as good as the data you use to train them. AI sexism and racist machines made the news in 2017. Yesterday, Slava Akhmechet tweeted about his word2vec language model test. According to the tweet, the model was trained on the Google News dataset that contains one billion words from news articles …

# You Should Use Deep Learning for That!

I have been studying machine learning and artificial intelligence lately. Then I ran into this tweet, which resonated deeply. 😉 The image is a clever update of XKCD’s original obnoxious physicist.