What can Text Analytics do for your Business?

Business Analytics, Code, ML, Python
Companies, big or small, gather large amounts of data in the form of customer feedback, product and service reviews, surveys, complaints, and so on. Most of this data is text-based, with some in other forms such as ratings, and it takes a large effort to understand, categorize, and act on it. Management often faces process and performance improvement challenges like:

Is it possible to monetize the data?
Is it possible to reduce the effort and get reliable insights?
Is it possible to improve customer satisfaction and the performance of the staff who deal with customers?
Is it possible to use the data in everyday decision making?
...and many more.

The answer is yes! That's where Cognitive Analytics comes in, especially Text Analytics.

What Is Text Analytics?

Text analytics is all about making sense of text or language. It is about processing unstructured data…

Neural Network – notes

Code, ML, Python
This is not yet another "Neural Network from scratch" post, nor one that goes into detail on neural networks. It is more a set of notes that I would like to recollect from my Coursera days.

NNs are more of a brute-force algorithm than a straight mathematical one like SVM or PCA (except for the partial derivatives in backpropagation). Again, everything starts with \(y = mX + C\), but since an NN model is a logistic model, we use \(y = Wx + b\), where \(W\) is the weights vector and \(b\) is the bias. Most importantly, \(h(z)\) is a non-linear function, which could be either the sigmoid \(g(z) = \frac{1}{1+e^{-z}}\) or \(\tanh(z)\).

There is a nice joke: "When you torture the data long enough, it will start confessing to everything!" That is exactly what we do in NNs. We pass the data through multiple…
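
As a rough illustration of the \(y = Wx + b\) step followed by a non-linearity, here is a minimal sketch of a single layer's forward pass. The function names, the weights, and the input values are made up for the example and are not taken from the original post.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid non-linearity g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def layer_forward(W, x, b, activation=sigmoid):
    """One layer: linear step z = Wx + b, then a non-linear activation."""
    z = W @ x + b
    return activation(z)

# Hypothetical weights, input and bias, chosen so that z is exactly zero.
W = np.array([[0.5, -0.25],
              [1.0, 2.0]])
x = np.array([2.0, 4.0])
b = np.array([0.0, -10.0])

h_sigmoid = layer_forward(W, x, b)                   # sigmoid(0) = 0.5 per unit
h_tanh = layer_forward(W, x, b, activation=np.tanh)  # tanh(0) = 0 per unit
```

Stacking several such layers, each feeding its output to the next, is the "passing the data through multiple" steps idea in its simplest form.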

SVM from Scratch? Anyone?

Code, ML, Python
We have already seen SVM here, but this post was born out of trying to understand the inner workings of the kernel and the predict function. While working on the custom kernel example in Sklearn, I found that support_vectors_ is empty, which led to the question: how is a custom kernel used to transform the test points? Here is the answer!

For example, when input data X of shape \(i \times j\) is transformed by a Gaussian kernel, the new data is an \(i \times i\) matrix, which we use to train the model. Now, when we want to predict some data points, say X1 (\(m \times j\)), we need to use the kernel to get an \(m \times i\) matrix. How is that done?

So, we need to use the original training set for the transform, and the equation is…
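
To make the shapes concrete, here is a small sketch that computes a Gaussian kernel between two sets of points. It is not the post's own equation (which is cut off above); the helper `gaussian_kernel`, the `sigma` parameter, and the random data are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel between every row of A and every row of B.

    A has shape (n_a, j) and B has shape (n_b, j); the result is (n_a, n_b).
    """
    sq_dist = (
        (A ** 2).sum(axis=1)[:, None]
        + (B ** 2).sum(axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    sq_dist = np.maximum(sq_dist, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))    # training set: i = 6 points, j = 3 features
X1 = rng.normal(size=(2, 3))   # points to predict: m = 2

K_train = gaussian_kernel(X, X)   # (i x i), used to train the model
K_test = gaussian_kernel(X1, X)   # (m x i), test points against the training set
```

The key point is that the test points are always compared against the original training set, so the number of columns stays at \(i\) for both matrices.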

SVM, Optimization and Kernels

Code, ML
Here again with notes that I can get back to when my memory fails! This is one of those posts that should clear up doubts about optimization techniques, the math behind kernels, and so on.

SVM uses the same basic vector inner product that we have already seen here in getting to the bottom of PCA: \(a_{1} = a \cdot \hat{b}\), where \(\hat{b}\) is the unit vector of \(b\).

[Figure: Vector Inner Product]

We use SVM to classify data points by drawing a plane that separates the data. We want to draw the plane in such a way that the distance between the plane and the points, also called the margin, is maximum.

We can see in the figure above that the best fit is where \(\frac{2}{\|w\|}\) is maximum. That can be written as a minimization problem \(\min \|…
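
A tiny numeric check of the two quantities mentioned in this excerpt, the projection \(a_{1} = a \cdot \hat{b}\) and the margin width \(\frac{2}{\|w\|}\), might look like the sketch below. The vectors are made-up values, not taken from the original post.

```python
import numpy as np

# Hypothetical vectors for illustration.
a = np.array([3.0, 4.0])
b = np.array([2.0, 0.0])

b_hat = b / np.linalg.norm(b)  # unit vector in the direction of b
a1 = a @ b_hat                 # length of the projection of a onto b

# Margin width for a hypothetical weight vector w.
w = np.array([3.0, 4.0])
margin = 2.0 / np.linalg.norm(w)  # a smaller ||w|| gives a wider margin
```

Here `a1` comes out as 3.0 (only the component of `a` along `b` survives) and `margin` as 0.4, since \(\|w\| = 5\).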

PCA! Why?

ML
PCA, Principal Component Analysis, as the name indicates, analyzes the data by reducing the number of components (dimensions) to those that represent the maximum amount of the data, just like the 80/20 rule, where 80% is represented by 20%.

In the figure above, the X-axis is the number of reduced components and the Y-axis is fidelity (0 to 1), 1 being the highest. One can see that with 100 components (10%), more than 93% of the data is represented, while the original data had 1024 variables. For example, an image of 1024 pixels can be reduced to 100 pixels while retaining 90% accuracy.

*Image courtesy: ML by Andrew Ng

Below: original image (1024) vs. K = 100
Below: original image (1024) vs. K = 250

What is K? We will shortly see more about the K components (PCA) that help in dimensionality reduction.

So, PCA is about reducing the dimension. In another example…
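
To give a feel for what "K components" and "fidelity" mean, here is a minimal sketch of PCA through NumPy's SVD. The function name `pca_reduce`, the synthetic data, and the use of the retained-variance ratio as the fidelity measure are assumptions for this example, not the post's own code.

```python
import numpy as np

def pca_reduce(X, k):
    """Project X onto its top-k principal components.

    Returns the reduced data, the components, the feature means, and the
    fraction of total variance retained by the k components.
    """
    mean = X.mean(axis=0)
    Xc = X - mean                                   # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                             # top-k directions
    Z = Xc @ components.T                           # reduced representation
    retained = (S[:k] ** 2).sum() / (S ** 2).sum()  # variance kept (0 to 1)
    return Z, components, mean, retained

# Synthetic data: 20 observed variables that really only have 5 degrees of freedom.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 20))

Z, components, mean, retained = pca_reduce(X, k=5)
X_approx = Z @ components + mean  # reconstruction from just 5 components
```

Because the synthetic data has only five underlying directions, five components retain essentially all of the variance; real images lose a little, which is the trade-off the fidelity curve describes.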

K-Means vs K-Nearest Neighbors

Code, ML, Python
We have already seen K-Means. Let's see its neighbor, K-Nearest Neighbors.

K-Means and K-Nearest Neighbors (KNN) use similar algorithms, but their predictions and use cases are different, and they fall into different categories of ML: Unsupervised and Supervised, respectively.

K-Means is used for clustering: given a set of data, K-Means groups it into K clusters. KNN is used for Classification and Regression, that is, it classifies a new data point according to the K data points nearest to it. In terms of the algorithm:

1. Take the squared distances from the new point to the points in the data set,
2. Sort them in ascending order, and
3. Take the top K data points and find the majority class. In the case of Regression, take the mean of the K data points.

We can see it's different from K-Means, though it uses the same squared distance for computing.

Use Cases

Recommender Systems, for…
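
The KNN steps described in this excerpt can be sketched roughly as follows. The function name `knn_predict` and the toy data are made up for illustration.

```python
from collections import Counter

import numpy as np

def knn_predict(X_train, y_train, x_new, k=3, regression=False):
    """Predict for x_new from its k nearest training points."""
    # 1. Squared distance from the new point to every training point.
    sq_dist = ((X_train - x_new) ** 2).sum(axis=1)
    # 2. Sort ascending and 3. keep the indices of the top k.
    nearest = np.argsort(sq_dist)[:k]
    if regression:
        # Regression: mean of the k nearest target values.
        return float(np.mean(y_train[nearest]))
    # Classification: majority class among the k nearest points.
    return Counter(y_train[nearest].tolist()).most_common(1)[0][0]

# Toy data: two well-separated groups of points.
X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
y_class = np.array([0, 0, 0, 1, 1, 1])
y_value = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])

label_a = knn_predict(X_train, y_class, np.array([0.5, 0.5]))
label_b = knn_predict(X_train, y_class, np.array([5.5, 5.5]))
value_a = knn_predict(X_train, y_value, np.array([0.2, 0.2]), regression=True)
```

Note that there is no fitting step at all: the "model" is simply the stored training data, and all the work happens at prediction time.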

Jump to ..

About, Code, ML, Python
Here is the list of posts that I wrote. These are more like notes that I wanted to jot down, as I have a very poor memory. I'm an Engineer, and I try to understand the applied maths behind these algorithms. These posts serve me as a set of concise notes, as well as code snippets I don't have to google! You may find these useful.

It all starts with y=mx+b
All about Logistic Regression
All about K-Means
K-Means vs K-Nearest Neighbors
PCA! Why?
PCA using Python, very satisfying post! so far!
SVM, Optimization and Kernels
SVM from scratch! Anyone?
Neural Network – notes
What can Text Analytics do for your Business
How to use Predictive and Prescriptive Analytics in Decision making
Market Profile Chart using Python!

All about K-Means

Code, ML, Python
K-Means is probably one of the easiest algorithms to implement. It doesn't even use any high school math like partial derivatives or slopes; middle school math is good enough. And it doesn't even have a training phase, so I doubt we can even call it a Machine Learning algorithm.

K-Means is a clustering algorithm: as the name suggests, it groups the data into K groups. It's like a government that wants to name a capital city that is equidistant from all other cities! In practice that may not be possible, but what we try to do is minimize the sum of all distances from the capital city to the others. That is it. Now add K capital cities and assign the closer cities to each of the K cities, and we have K number…
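
The "capital cities" idea can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the post's own code: the function name `k_means`, the fixed iteration count, the random initialization, and the toy data are all made up for the example.

```python
import numpy as np

def k_means(X, k, n_iter=20, seed=0):
    """Group the rows of X into k clusters; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Start with k randomly chosen data points as the "capital cities".
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its closest centroid (squared distance).
        sq_dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = sq_dist.argmin(axis=1)
        # Move each centroid to the mean of the points assigned to it.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# Toy data: two obvious groups of "cities".
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels, centroids = k_means(X, k=2)
```

Each pass only needs distances and averages, which is why the algorithm gets by on middle school math.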

Market Profile Chart using Python!

Code, ML, Python
Continuing from the earlier code, where I created the profile data using letters, here I try to plot the chart. This one turned out a bit tricky, and I used an example snippet from Matplotlib for a stacked bar chart!

The logic is simple: instead of letters, store a number for each occurrence, then use a Matplotlib stacked bar chart to plot the colors.

    from collections import defaultdict
    import numpy as np
    import pandas as pd

    # df and frequency are defined in the earlier code
    mi = defaultdict(list)
    TGroups = df.groupby([pd.Grouper(key='DateTime', freq=frequency)])
    # Iterate over each group and add to the dictionary.
    # Dictionary keys are the rounded prices between 'Low' and 'High' of each group;
    # values are the char A, B, ... incremented for each period (freq group).
    # Default is 30 min since we grouped based on freq;
    # for each group increment the char, i.e. +1.
    min_price = np.round(df.Low.min())
    max_price = np.round(df.High.max())
    for t, g in TGroups:
        g_min_price = np.round(g.Low.min())
        g_max_price = np.round(g.High.max())
        for price in range(int(min_price), int(max_price + 1)):
            if (price > g_max_price) or (price < g_min_price):
                mi[price].append(0)
            else:…

Market Profile using Python!

Code, ML, Python
Anyone who has done technical analysis using popular trading charts will have noticed Market Profile as one of the techniques. It is a really different way of looking at time series data.

The logic is to slice the data into periods, say every 30 minutes, and assign a letter, starting with 'A', to every price in that slice. Prices are also rounded off so that we don't have too many keys. Simply put, in programming parlance: create a dictionary with the rounded price as keys and append a new letter for every slice.

Say in the first 30 minutes the low is 100 and the high is 110; then for 100 to 110, append 'A' as a value, so our dictionary would be [{100, 'A'},{101, 'A'},{102, 'A'},{103, 'A'} ..{110,'A'}]. Then, for the next slice of 30 min, the…
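
The dictionary-building idea can be sketched without any market data at all. In the example below, the function name `market_profile` and the second slice's low and high (105 and 112) are hypothetical values chosen for illustration; only the first slice (100 to 110) comes from the text above.

```python
import string
from collections import defaultdict

def market_profile(slices):
    """Build a {rounded price: letters} dictionary.

    `slices` is a list of (low, high) pairs, one per period. The first
    period is tagged 'A', the second 'B', and so on.
    """
    profile = defaultdict(str)
    for letter, (low, high) in zip(string.ascii_uppercase, slices):
        # Every rounded price touched in this period gets the period's letter.
        for price in range(round(low), round(high) + 1):
            profile[price] += letter
    return dict(profile)

# First slice from the text; second slice is a made-up example.
profile = market_profile([(100, 110), (105, 112)])
```

Prices touched in both periods end up with both letters (for example, 105 maps to 'AB'), while prices touched only once keep a single letter, which is what gives the profile its shape when plotted.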