Machine learning solutions apply to a wide range of domains

Machine learning solutions have a wide range of applications. Well-known uses include identifying and filtering spam email, penalizing spam-based blogs so that users get relevant search results, recommending relevant products, and fighting online fraud. Less obvious uses can help companies build better, smarter products faster: identifying a user from voice, image or accelerometer data (the way a user holds or moves a mobile device), predicting prices in various types of auctions, forecasting stock price movements, automating the control of employee access levels in a given system, predicting waiting times in hospital emergency units, and assessing the risk of myocardial infarction, stroke or seizure episodes.

To illustrate some of the types of problems that machine learning can solve, we have put together a set of case studies, most of which come from internal training and experimentation sessions carried out for teaching purposes, to demonstrate specific concepts:

This is our only commercial machine learning project to date. The client is a startup that offers software platforms to consulates, aiming to let citizens apply for a visa remotely and thus simplify the process significantly.

The machine learning approach came from the need to scan the passport with a mobile device’s camera (iPad), read the data automatically and enter the recognized information (such as first and last name, birth date, country, passport number etc.) into the appropriate spaces of the visa application forms.

The client initially requested a proof of concept (PoC) of the mobile application, meant to run on iPad, in order to present the idea to consulates as potential customers. Satisfied with the PoC, the client decided to continue collaborating with Roweb on implementing the end product as well as several related ones.

The main technologies involved in implementing the mobile application for scanning passports were:

  • iOS
  • Objective-C, C++
  • Tesseract – one of the most popular open source optical character recognition (OCR) libraries
  • OpenCV – one of the most popular libraries for computer vision tasks

Some of the most challenging aspects that we managed to overcome during implementation were:

  • performing OCR in various light conditions, from under-exposed to over-exposed passport details

We designed and implemented a sentiment analysis system able to automatically detect the general feelings expressed in movie reviews. We were interested in the sentiment polarity of user comments, designing the system to classify movie reviews as either positive or negative.

The core component of the sentiment analysis software was the machine learning algorithm able to learn the probabilistic model of detecting the sentiment polarity in a movie review.

The learning process involved gathering a set of 50000 labeled reviews, each with an associated polarity derived from the number of stars awarded by the user. The following technologies were used:

  • C#, C++
  • LibSVM, LibLinear – C++ libraries implementing the Support Vector Machines algorithm

The most notable challenges that we overcame during the development were:

  • gathering the training data, which consisted of 50000 labeled movie reviews
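As a rough illustration of how a linear Support Vector Machine can learn sentiment polarity from labeled reviews, here is a minimal Python sketch. It is not the LibSVM/LibLinear implementation used in the project: the four toy reviews, the bag-of-words features, the function names and the hyperparameters (`lr`, `lam`, `epochs`) are all assumptions made for the example.

```python
def featurize(text):
    """Bag-of-words counts for a lowercased, whitespace-tokenized review."""
    counts = {}
    for token in text.lower().split():
        counts[token] = counts.get(token, 0) + 1
    return counts


def score(w, b, features):
    """Signed distance-like score: positive means positive polarity."""
    return b + sum(w.get(tok, 0.0) * cnt for tok, cnt in features.items())


def train_linear_svm(examples, lr=0.05, lam=0.01, epochs=200):
    """Stochastic subgradient descent on the L2-regularized hinge loss.

    examples: list of (features, label) pairs with label in {+1, -1}.
    """
    w, b = {}, 0.0
    for _ in range(epochs):
        for features, label in examples:
            margin = label * score(w, b, features)
            # Regularization step: shrink every weight slightly.
            for tok in w:
                w[tok] *= 1.0 - lr * lam
            # Hinge-loss step: only examples inside the margin push the weights.
            if margin < 1.0:
                for tok, cnt in features.items():
                    w[tok] = w.get(tok, 0.0) + lr * label * cnt
                b += lr * label
    return w, b


def classify(w, b, text):
    return "positive" if score(w, b, featurize(text)) >= 0 else "negative"


# Hypothetical toy data standing in for the 50000 labeled reviews.
training_reviews = [
    ("a great and moving film", "positive"),
    ("a boring and dull film", "negative"),
    ("great acting and a wonderful story", "positive"),
    ("terrible acting and a boring story", "negative"),
]
examples = [(featurize(text), 1 if label == "positive" else -1)
            for text, label in training_reviews]
w, b = train_linear_svm(examples)
```

Calling `classify(w, b, "a wonderful and great story")` then returns a polarity label for an unseen review. A production system would rely on a tested library and a far larger vocabulary rather than this hand-rolled loop.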

The goal of this application was to detect whether microchips from a fabrication plant met quality assurance standards.

During quality assurance tests, several measurements were performed on each microchip. The tests that determined whether a microchip was functioning correctly relied on the relationships between the values of these measurements.

The core component of the project was the classification algorithm for detecting malfunctioning microchips based on the set of measurements made on the devices.

The main technologies involved were:

  • MATLAB – the programming environment on which the software was implemented
  • Logistic Regression – the classification algorithm used for learning the probabilistic model associated with malfunction detection
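The idea of learning a probabilistic pass/fail model from measurements can be sketched as follows. The project itself was written in MATLAB; this Python version is only an illustration, and the two measurements per chip, the data values, the learning rate and the function names are all made up for the example.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Fit weights (bias stored first) by batch gradient descent on the log loss."""
    m = len(X)
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / m for wj, gj in zip(w, grad)]
    return w


def predict_defect_probability(w, x):
    """Estimated probability that a chip with measurements x fails QA."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


# Hypothetical normalized measurements; label 1 = failed QA, 0 = passed.
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
y = [0, 0, 0, 1, 1, 1]
weights = train_logistic_regression(X, y)
```

A chip would then be flagged when `predict_defect_probability(weights, x)` exceeds a chosen cutoff such as 0.5. Because the real tests depended on relationships between measurements, a practical model would probably also need interaction or polynomial features, which this sketch leaves out.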

The goal of this application was to detect and recognize handwritten text. The capability to automatically recognize handwriting has recently been in increasing demand, since it can be used for a wide range of purposes, from reading zip codes on mail envelopes to recognizing the amounts written on bank checks.

The core component of this project was the classification algorithm for understanding handwritten information and translating it into machine-readable representations. A corpus of 10000 labeled letters and digits was used in training the classification model.

The main technologies involved in this solution were:

  • MATLAB – the programming environment on which the software was implemented
  • Neural Networks – the classification model trained to recognize handwritten digits and letters

The purpose of this project was to develop a spam filter solution that could accurately classify emails into spam or non-spam.

The core component of the project was the spam classification algorithm. The solution involved a corpus of 4000 labeled emails used in the learning process and the following main technologies:

  • Support Vector Machines – the algorithm used to train the model that classifies the emails

The training and classification phases required a series of cleaning and normalization procedures for email messages: removing non-word content, stripping HTML, normalizing URLs and email addresses, and applying stemming and lemmatization.
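A minimal Python sketch of such a cleaning pipeline is shown below. The placeholder tokens `httpaddr` and `emailaddr`, the regular expressions and the crude `naive_stem` helper are assumptions made for illustration; a real system would use a proper stemmer (for example a Porter stemmer), and lemmatization is left out entirely.

```python
import re


def naive_stem(word):
    """Crude suffix stripping, standing in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize_email(text):
    """Turn a raw email body into a list of normalized word tokens."""
    text = text.lower()
    text = re.sub(r"<[^<>]+>", " ", text)                       # strip HTML tags
    text = re.sub(r"(https?://|www\.)\S+", " httpaddr ", text)  # normalize URLs
    text = re.sub(r"\S+@\S+", " emailaddr ", text)              # normalize email addresses
    text = re.sub(r"[^a-z]+", " ", text)                        # drop non-word content
    return [naive_stem(token) for token in text.split()]


tokens = normalize_email(
    "<p>Visit http://example.com NOW or email sales@example.com for discounts!</p>"
)
# tokens == ['visit', 'httpaddr', 'now', 'or', 'email', 'emailaddr', 'for', 'discount']
```

Mapping every URL and address to a single placeholder lets the classifier learn from the presence of a link or address rather than from its exact value.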

The purpose of this solution was to implement an algorithm able to detect anomalous behavior in server computers, based on a series of measurements of their operation (e.g. throughput and latency).

The core component of this project was the algorithm for detecting the malfunctioning nodes in a cluster of servers. The main technologies involved were:

  • Learning the parameters of a multivariate Gaussian distribution – the formalism behind the anomaly detection system
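To make the approach concrete, here is a small Python sketch for the two-feature case: it estimates the mean and covariance of a Gaussian from measurements of healthy servers and flags a node whose density falls below a threshold. The sample values, the threshold `EPSILON` and the function names are hypothetical; in practice the threshold would be tuned on labeled validation data.

```python
import math


def fit_gaussian_2d(samples):
    """Estimate the mean vector and 2x2 covariance matrix from (x, y) samples."""
    m = len(samples)
    mu = [sum(s[0] for s in samples) / m, sum(s[1] for s in samples) / m]
    sxx = sum((s[0] - mu[0]) ** 2 for s in samples) / m
    syy = sum((s[1] - mu[1]) ** 2 for s in samples) / m
    sxy = sum((s[0] - mu[0]) * (s[1] - mu[1]) for s in samples) / m
    return mu, [[sxx, sxy], [sxy, syy]]


def density(x, mu, cov):
    """Bivariate Gaussian probability density at point x."""
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form (x - mu)^T cov^-1 (x - mu), using the closed-form 2x2 inverse.
    q = (cov[1][1] * dx * dx - 2 * cov[0][1] * dx * dy + cov[0][0] * dy * dy) / det
    return math.exp(-0.5 * q) / (2 * math.pi * math.sqrt(det))


def is_anomalous(x, mu, cov, epsilon):
    """A node is flagged when its measurements are very unlikely under the model."""
    return density(x, mu, cov) < epsilon


# Hypothetical (throughput in MB/s, latency in ms) readings from healthy servers.
healthy = [(100, 10), (102, 11), (98, 9), (101, 10), (99, 12), (100, 8)]
mu, cov = fit_gaussian_2d(healthy)
EPSILON = 1e-3  # assumed threshold, not taken from the project
```

With these values, `is_anomalous((60, 40), mu, cov, EPSILON)` flags a node with low throughput and high latency, while a reading close to the mean is treated as normal. Modeling the full covariance, rather than each feature separately, lets the detector notice unusual combinations of values.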

This solution focused on implementing a movie recommender system based on a large dataset of users, movies and user ratings for these movies. User ratings were expressed on a scale of 1 to 5, with 5 stars being the highest rating.

The core component of the project was the recommender system algorithm, able to generate recommendations tailored to each user’s tastes based on their past movie ratings and the ratings provided by the entire community. The main technologies used in implementing this solution were:

  • MATLAB – the programming environment on which the software was implemented
  • Collaborative Filtering – the algorithm capable of making automatic predictions concerning a user’s interests by collecting preference information from a group of users
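One common way to implement collaborative filtering is low-rank matrix factorization, sketched below in Python. The source does not say which variant the MATLAB project used, so this is only an illustration; the toy ratings, the number of latent factors and the other hyperparameters are assumptions.

```python
import random


def train_matrix_factorization(ratings, n_users, n_movies, n_factors=2,
                               lr=0.02, reg=0.02, epochs=2000, seed=0):
    """Learn user and movie factor vectors by stochastic gradient descent.

    ratings: list of (user_index, movie_index, stars) for observed ratings only.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(n_factors)] for _ in range(n_movies)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for k in range(n_factors):
                pu, qi = P[u][k], Q[i][k]
                # Gradient step on the squared error with L2 regularization.
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q


def predict_rating(P, Q, user, movie):
    """Predicted stars: dot product of the user and movie factor vectors."""
    return sum(pu * qi for pu, qi in zip(P[user], Q[movie]))


def mean_squared_error(P, Q, ratings):
    return sum((r - predict_rating(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings)


# Hypothetical ratings on the 1 to 5 scale: users 0 and 1 prefer movies 0 and 1,
# users 2 and 3 prefer movies 2 and 3. Four user/movie pairs are left unrated.
ratings = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5), (1, 3, 1),
    (2, 0, 1), (2, 2, 5), (2, 3, 4),
    (3, 1, 2), (3, 2, 4), (3, 3, 5),
]
P, Q = train_matrix_factorization(ratings, n_users=4, n_movies=4)
```

Recommendations for a user come from calling `predict_rating` on the movies that user has not rated yet, for example `predict_rating(P, Q, 0, 3)`, and ranking the results. The ratings supplied by other users shape the shared movie factors, which is how the community's preferences influence each individual prediction.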
