Timothy Nguoi Yew Shan

Profile Picture

Profile Info

Full Name: Timothy Yew Shan Nguoi

Student Number: D00189967

Email address: D00189967@student.dkit.ie

Current Studies: BSc in Computing Year 4

Technical Portfolio

Roles

  • Back-end developer
  • Database
  • Networking
  • TacTalk Text-to-Event Parsing Engine Programmer 

Languages and Frameworks

  • MongoDB
  • Node JS
  • Android Java

 

Overview

For this group project we chose to build a voice-command tactical annotation mobile application for sports teams. The app records team performance in real time through voice commands from the user. The commands are converted into statistics and sent to our cloud server, allowing the user to refine and analyze them later.

Text-to-Event Parsing Engine

One of my chief responsibilities in this project is the development of the Text-to-Event Parsing Engine. This component runs on our Heroku server and is responsible for converting user input, already transcribed from voice recordings to text, into data known as events, which we store in our MongoDB database.

The parsing engine collects a queue of unprocessed user input in text format, sorts it in order and appends one word at a time. After each word, the parser scans for keywords related to the game such as "kick pass", "goal" or "point". If a keyword is detected, the parsing engine returns the numeric value associated with it and stores it in the event data structure, which is then saved to the database.
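The scanning loop described above can be sketched roughly as follows. This is a minimal illustration, not the engine's actual code: the `order` and `text` fields on each queue entry, the function name `scanQueue` and the numeric codes in `KEYWORD_CODES` are all assumptions made for the example.

```javascript
// Hypothetical keyword table: the numeric codes are illustrative only.
const KEYWORD_CODES = {
  "kick pass": 1,
  "hand pass": 2,
  "goal": 3,
  "point": 4,
};

// Sort queued text chunks by their (assumed) sequence number, then feed the
// words to the scanner one at a time.
function scanQueue(queue) {
  const words = [...queue]
    .sort((a, b) => a.order - b.order)
    .flatMap((chunk) => chunk.text.toLowerCase().split(/\s+/).filter(Boolean));

  const detected = [];
  let buffer = "";
  for (const word of words) {
    // Append one word, then check whether the buffer now ends in a keyword.
    buffer = buffer ? `${buffer} ${word}` : word;
    for (const [keyword, code] of Object.entries(KEYWORD_CODES)) {
      // Match whole words only, so "appoint" does not count as "point".
      if (buffer === keyword || buffer.endsWith(` ${keyword}`)) {
        detected.push({ keyword, code });
        buffer = ""; // start collecting the next command
        break;
      }
    }
  }
  return detected;
}
```

Because the buffer grows one word at a time, a two-word keyword such as "kick pass" is still found when its words arrive in separate queue entries.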

Language processing modules supplement the keyword scanning process: instead of requiring a 100% match, they use a score-based system to determine text similarity. Some keyword mistakes that commonly come up in the voice-to-text result cannot be verified by the language processing modules. I handle these by including a "dictionary" array for each keyword, a collection of common mistakes associated with that keyword, which the scanning process can also match against.
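One way to picture the "dictionary" of common mistakes is a lookup from each misheard phrase back to its keyword. The sketch below assumes a simple object layout; the misheard phrases and the name `resolveKeyword` are invented for illustration and are not taken from the project.

```javascript
// Hypothetical dictionary: each keyword lists transcriptions that the
// voice-to-text service is assumed to produce by mistake.
const KEYWORD_DICTIONARY = {
  "kick pass": ["kick past", "kit pass"],
  "goal": ["gold", "gole"],
  "point": ["pint", "paint"],
};

// Build a reverse lookup so every known spelling maps to its keyword.
const CANONICAL = new Map();
for (const [keyword, mistakes] of Object.entries(KEYWORD_DICTIONARY)) {
  CANONICAL.set(keyword, keyword);
  for (const mistake of mistakes) {
    CANONICAL.set(mistake, keyword);
  }
}

// Return the keyword for a phrase, or null when the phrase is unknown.
function resolveKeyword(phrase) {
  return CANONICAL.get(phrase.trim().toLowerCase()) ?? null;
}
```

A lookup like this only covers mistakes that have been seen before, which is why it works alongside similarity scoring rather than replacing it.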

 

Stats Compiler

The development of the Stats Compiler is another of my main responsibilities in this project; it is also a component on our Heroku server. The component retrieves stored game events from the database, then counts and compiles them into a list of statistics which the sports analysts can view and process.

Most event types and outcome types are associated with a "stats type", which indicates which statistic the numeric value counts towards in the Stats Compiler. For example, the numeric values representing "Hand pass" and "Kick pass" are both associated with the stats type "pass", so both count towards the team's total passes made during the match.
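The mapping from numeric values to stats types could look something like the sketch below. The codes, the `eventType` and `teamId` field names and the function `countStats` are assumptions for illustration; only the idea that "Hand pass" and "Kick pass" both count as "pass" comes from the description above.

```javascript
// Hypothetical mapping from numeric event codes to stats types.
const STATS_TYPE_BY_CODE = {
  1: "pass", // kick pass
  2: "pass", // hand pass
  3: "shot",
};

// Count one team's events per stats type; codes without a stats type are skipped.
function countStats(events, teamId) {
  const totals = {};
  for (const event of events) {
    if (event.teamId !== teamId) continue;
    const statsType = STATS_TYPE_BY_CODE[event.eventType];
    if (!statsType) continue;
    totals[statsType] = (totals[statsType] || 0) + 1;
  }
  return totals;
}
```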

The statistics that are counted and calculated are as follows:

  1. Passes: Total passes made by a team
  2. Shot: Total shots attempted by a team
  3. Goal: Total goals scored by a team
  4. Point: Total points scored by a team
  5. Turnover: A turnover happens when the opposing team takes the ball away
  6. Kickout: A kickout is when a team kicks the ball out from the goal
  7. Wide: A wide is when a team kicks the ball out of the field
  8. Pass Completion: Total successful passes of a team
  9. Possession: A team's passes as a share of all passes made in the match
  10. Shot Conversion: Rate of shots that resulted in a point or goal
  11. Zones with most shots: Areas of the field where the most shots were taken
  12. Zones with most kickouts: Areas of the field where the most kickouts were taken
  13. Kickouts won: Kickouts that did not end in a turnover
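As a rough illustration of the two rate-based statistics in the list, possession and shot conversion can be derived from the counted totals. The field names and the function `derivedStats` are assumptions, and returning 0 when there is nothing to divide by is a choice made for this sketch.

```javascript
// Divide safely: a team with no shots (or a match with no passes) gets 0.
function ratio(numerator, denominator) {
  return denominator === 0 ? 0 : numerator / denominator;
}

// team: counted totals for one team; matchTotalPasses: passes by both teams.
function derivedStats(team, matchTotalPasses) {
  return {
    // Share of all passes in the match that were made by this team.
    possession: ratio(team.passes, matchTotalPasses),
    // Share of this team's shots that resulted in a goal or a point.
    shotConversion: ratio(team.goals + team.points, team.shots),
  };
}
```

For example, a team with 30 of the match's 50 passes has a possession of 0.6, and 2 goals plus 3 points from 10 shots gives a shot conversion of 0.5.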

Create Game API Demo

Download tactalk_api_create_demo.mp4 [15.83MB]

Voice to text cloud function API Demo

Text to Event Demo

Project Structure Image

Project Structure

Our project consists of 5 parts:

  • The MongoDB database
  • The Node JS server
  • The Mobile Application
  • The Website
  • The Google Cloud Service

The MongoDB database is responsible for storing user information as well as other data that both the mobile application and the website need in order to function. We chose MongoDB because a database can be set up quickly and easily on MongoDB Atlas, which allows us to focus on other aspects of the system. MongoDB Atlas also provides free storage space for developers, and we are considering options for upgrading our database in the future should we expand our services.

The Node JS server handles API calls between the database and both the mobile application and the website. Node JS was selected for its speed and the availability of many useful node modules, one of which is the Express JS library that we currently use to facilitate API calls. The server is also responsible for parsing the user input into JSON documents which can later be uploaded to the database. The server is hosted on Heroku because it is easy to set up, saving us time which we can spend on developing our application instead. The service also offers real-time deployment when connected to GitHub, allowing us to see our changes immediately once we upload our updates to GitHub. The service is free for developers with limited space and computing power, which is still more than we need.

The Mobile Application is the Android app that our users use to register a game, start the game and provide voice input to record the game events. It communicates with the Node JS server via a REST API to send and receive data.

The Website is our front end, where users log in with their account to view, refine and analyze their tallied game events.

Technical Problems

During the development process, I encountered a number of technical problems that the project needed to overcome.

1. Network and Database

The statistics recorded by the user must be sent to a cloud server. We came up with a demo framework that connects the mobile application to a Node JS server via a REST API, while the Node JS server is connected to a MongoDB database hosted on MongoDB Atlas. The pricing of the server and database will be our main cost in maintaining the application long term, which we will need to take into consideration when deciding the pricing model for our users.

Solution 

We resolved this problem by adopting the Express JS node module for our server, which streamlines the development of our API endpoints.

We decided to host our server on Heroku because its free quota was more than enough for us to develop the server. It also supports connecting our GitHub accounts, allowing us to quickly deploy the server directly from GitHub.

For our voice-to-text service, we used Google Cloud Speech-to-Text after comparing its pricing and accuracy with similar services offered by IBM and AWS. Additional information about the audio clip is appended to the filename, which a cloud function can parse and send to our API endpoints along with the voice-to-text result for further processing.
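The filename convention itself is not documented here, so the sketch below assumes a purely hypothetical format, `<gameId>_<userId>_<recordedAtMs>.<extension>`, to show how a cloud function might recover the extra information before forwarding it. The function name and every field are illustrative assumptions.

```javascript
// Assumed format (hypothetical): <gameId>_<userId>_<recordedAtMs>.<extension>
function parseClipFilename(filename) {
  const match = /^([^_]+)_([^_]+)_(\d+)\.(\w+)$/.exec(filename);
  if (!match) {
    return null; // the filename does not follow the assumed convention
  }
  const [, gameId, userId, recordedAt, extension] = match;
  return { gameId, userId, recordedAt: Number(recordedAt), extension };
}
```

Returning null for an unexpected filename lets the caller skip or log the clip instead of sending incomplete data to the API.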

2. Command Parsing

After the user's input is converted from audio to text, a parsing function generates JSON documents that record the game events in a way that can be tallied and analyzed. Each JSON document contains numeric representations of several aspects of an event: the type of event, the general position where the event occurred, the outcome of the event, the timestamp of the event, and the team and player ID. Certain keywords in the input text are associated with the relevant event, position and outcome; the function checks for these keywords and generates the JSON document, which is then uploaded to our MongoDB database. Multiple keywords can be associated with a single event, position or outcome to make the parsing function more flexible, the idea being that the more keywords we prepare, the more accurate the command parsing function will be.
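A minimal sketch of this keyword-to-document step is shown below. The keyword groups, numeric codes, field names and the function `buildEventDocument` are assumptions chosen for illustration; the real document schema is not shown in this portfolio.

```javascript
// Hypothetical keyword groups. Several keywords may share one numeric code,
// for example "centre" and "center" both mean position 2.
const FIELD_KEYWORDS = {
  eventType: { "shot": 1, "kick pass": 2, "hand pass": 3 },
  position: { "left": 1, "centre": 2, "center": 2, "right": 3 },
  outcome: { "goal": 1, "point": 2, "wide": 3 },
};

// context carries the values that do not come from keywords
// (timestamp, team ID and player ID in this sketch).
function buildEventDocument(text, context) {
  // Pad with spaces so that only whole words or phrases are matched.
  const padded = ` ${text.toLowerCase().trim().replace(/\s+/g, " ")} `;
  const doc = { ...context };
  for (const [field, keywords] of Object.entries(FIELD_KEYWORDS)) {
    doc[field] = null; // stays null when no keyword for this field is found
    for (const [keyword, code] of Object.entries(keywords)) {
      if (padded.includes(` ${keyword} `)) {
        doc[field] = code;
        break;
      }
    }
  }
  return doc;
}
```

The returned object is plain data, so it can be serialized with `JSON.stringify` before being uploaded.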

Solution

To increase the overall accuracy, we have made several changes to the command parser.

We adopted two natural language processing NPM modules, natural and graphic-smith-waterman, which use the Jaro-Winkler algorithm and the Smith-Waterman algorithm respectively. Each compares keywords by returning a similarity score, which allows us to match keywords even when they are not exactly the same, as the voice-to-text results are prone to outputting similar-sounding words.
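To show what score-based matching means in practice, here is a small standalone implementation of the Jaro-Winkler similarity. It is written from the published definition of the algorithm for illustration only; the project itself relies on the natural module rather than this code, and the 0.85 threshold and the function `bestKeyword` are assumptions.

```javascript
// Jaro similarity: 1 means identical, 0 means no characters in common.
function jaro(a, b) {
  if (a === b) return 1;
  if (a.length === 0 || b.length === 0) return 0;
  const range = Math.max(0, Math.floor(Math.max(a.length, b.length) / 2) - 1);
  const aMatched = new Array(a.length).fill(false);
  const bMatched = new Array(b.length).fill(false);
  let matches = 0;
  for (let i = 0; i < a.length; i++) {
    const lo = Math.max(0, i - range);
    const hi = Math.min(b.length - 1, i + range);
    for (let j = lo; j <= hi; j++) {
      if (!bMatched[j] && a[i] === b[j]) {
        aMatched[i] = true;
        bMatched[j] = true;
        matches++;
        break;
      }
    }
  }
  if (matches === 0) return 0;
  // Count matched characters that appear in a different order.
  let k = 0;
  let outOfOrder = 0;
  for (let i = 0; i < a.length; i++) {
    if (!aMatched[i]) continue;
    while (!bMatched[k]) k++;
    if (a[i] !== b[k]) outOfOrder++;
    k++;
  }
  const t = outOfOrder / 2;
  return (matches / a.length + matches / b.length + (matches - t) / matches) / 3;
}

// Jaro-Winkler boosts the score for a shared prefix of up to 4 characters.
function jaroWinkler(a, b, scaling = 0.1) {
  const base = jaro(a, b);
  let prefix = 0;
  while (prefix < 4 && prefix < a.length && prefix < b.length && a[prefix] === b[prefix]) {
    prefix++;
  }
  return base + prefix * scaling * (1 - base);
}

// Pick the best-scoring keyword, or null if nothing reaches the threshold.
function bestKeyword(input, keywords, threshold = 0.85) {
  let best = null;
  let bestScore = threshold;
  for (const keyword of keywords) {
    const score = jaroWinkler(input.toLowerCase(), keyword);
    if (score >= bestScore) {
      best = keyword;
      bestScore = score;
    }
  }
  return best;
}
```

With this sketch, a slightly wrong transcription such as "kik pass" still resolves to "kick pass", while an unrelated word returns null.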

We also changed how the player number is extracted from user input to simplify the process. Instead of saying "shot goal player 14", the user can now say "shot goal 14" and the command parser will recognize the player number. We removed the "player" keyword because Google Cloud Speech-to-Text mistranslated "player" into something else too frequently, while it converts spoken numbers to text with high accuracy. This also reduces the number of keywords a user has to say during a match, which can often be fast paced.
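A simple way to picture this change is to treat a trailing number in the command as the player number. The sketch below assumes jersey numbers of one or two digits and uses a hypothetical function name; the real parser may handle this differently.

```javascript
// Split a command such as "shot goal 14" into its keywords and player number.
// Assumption: the player number is a 1 or 2 digit number at the end.
function extractPlayerNumber(command) {
  const trimmed = command.trim();
  const match = /^(.*?)\s+(\d{1,2})$/.exec(trimmed);
  if (!match) {
    return { text: trimmed, playerNumber: null };
  }
  return { text: match[1], playerNumber: Number(match[2]) };
}
```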

We also implemented functions that allow the Command Parser engine to interpret which team has possession. For example, when "blue team possession" is spoken, the command parser assumes all subsequent actions are performed by members of the Blue team, until a keyword such as "green team possession" or "turnover" is spoken. Furthermore, when "turnover" is spoken, the command parser automatically interprets it as the opposing team gaining the ball, for example, from the Blue team to the Green team. Because the command parser interprets team possession automatically, the user does not have to constantly state which team is performing the action, and only needs to mention a change of possession.
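The possession rules can be sketched as a small piece of state that is updated as commands arrive. The team names come from the example above; the function name and the choice to not record "turnover" as its own action are assumptions made to keep the sketch short.

```javascript
const TEAMS = ["blue", "green"];

// Attach the team in possession to every action in a list of spoken commands.
function assignTeams(commands) {
  let possession = null; // unknown until a team is first announced
  const actions = [];
  for (const raw of commands) {
    const command = raw.toLowerCase().trim();
    const declared = TEAMS.find((team) => command === `${team} team possession`);
    if (declared) {
      possession = declared;
      continue;
    }
    if (command === "turnover") {
      // Possession flips to the other team, if a team is known.
      if (possession) {
        possession = TEAMS.find((team) => team !== possession);
      }
      continue;
    }
    actions.push({ action: command, team: possession });
  }
  return actions;
}
```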

 

 

Reflection

During this project, we also made a few mistakes in the course of developing and releasing TacTalk.

Lacking early tech demo

We weren't able to provide a demo of the voice-to-text technology during the early phase of the project. We realized that providing a demo, albeit an imperfect one, is still vital to gain the confidence of our stakeholders, in this case our lecturers. Once we realized this mistake, we contacted one of our lecturers outside of class to showcase our demo.

In the future, when working on a new project, we should try to prepare a demo of the key pieces of technology the project requires. This way both we and the stakeholders will have at least an idea of what the end product is going to look like.

Poor communication with the Stakeholders

Due to the nature of TacTalk, and because some of the group members, including me, are not well versed in Gaelic football, we decided to consult people involved in the sport, as we felt that we needed to tailor this application to the needs of professional sports analysis. This caused us to neglect communication from our lecturers, who focus more on human elements, user quality of life and testing.

We reflected on this serious mistake and realized that we are not solely developing an application for sports analysis, but also an application that must meet the standards of our other stakeholders, our lecturers. We then shifted our development direction to focus more on testing and user experience, and tried to balance the needs of both groups of stakeholders.

Remote team working

Our project also took place during a global pandemic. We were unable to meet in person on campus, so our project had to be facilitated through Microsoft Teams and Zoom. We also faced challenges in scheduling our team meetings, as some of the team members were in time zones other than Ireland's.

Handling burnout and fatigue is another issue we did not account for. We underestimated the length of the lockdown period, and I experienced low productivity during this time. Constant communication among the team members is needed to counter this problem and increase cohesion.