Air quality prediction using machine learning

Using statistical methods to estimate or predict the future behavior of a phenomenon is common practice in many disciplines, such as health care, trading, car insurance, and customer relationship management. The goal of these methods is not necessarily to predict the exact outcome but to determine the likelihood of different outcomes, so that one can prepare accordingly. Machine learning has further empowered such methods, allowing them to become even more accurate by using more data and more computation.
Still, gaining access to such data can be problematic in many cases, for three main reasons:

Volume: transferring such data can be very expensive, since it is taxing on network resources.
Privacy: the collected data can be privacy-sensitive; whatever process has access to it is exposed to specific details belonging to different individuals.
Legislation: in certain countries, data about the citizens of that country may not be moved outside the country for legal reasons.

Predictive models require a lot of data, which can be problematic, since storing such data is expensive and transferring it can place a significant load on networks.
What if there were a technique that allowed predictive models to be trained without transferring the required data in its original raw form?
To tackle this challenge, we collaborated with Uppsala University in Sweden, within the scope of a Computer Science project course that is part of their curriculum. Student projects are a regular part of our research work; they allow us to explore specific technology areas while also nurturing important relations with academia.
Air quality prediction
This year, we decided to set our sights on air quality prediction. Poor air quality has severe effects on people's health. If we can predict air quality, we can adjust our behavior at different levels, from individuals to communities, to countries, and even globally. One example is Beijing, where coal factories suspend their operation when air quality is poor.
Predicting air quality is a challenging problem, since air quality can vary significantly from one place to another: from quiet residential areas and parks to busy streets and industrial areas. We also need to consider atmospheric conditions such as rain, air pressure, and temperature, which affect the quantity of each pollutant in the air. The data collected can be used beyond studying air quality; we can aspire toward predictive methods that help us proactively determine measures that could improve air quality and/or shield sensitive groups from its effects.

Putting predictive models to the test
The goal of the project was to move away from the use of centralized data, that is, large aggregations of data from multiple air quality monitoring stations. That is the typical approach for training supervised machine learning models, but it requires the transfer and aggregation of large volumes of raw data.
Instead, the students investigated federated learning, in which a machine learning model is trained at each station and the resulting models are then combined using federated averaging.
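To make the aggregation step concrete, here is a minimal sketch of federated averaging in Python with NumPy. It assumes that each station reports its model parameters as a list of arrays, together with the number of samples it trained on, and that stations are weighted by sample count, as in the commonly described FedAvg scheme. The function name, the toy parameter values, and the weighting are illustrative assumptions; the project's actual implementation may differ.

```python
import numpy as np


def federated_average(station_weights, sample_counts):
    """Combine per-station model parameters into one set of global parameters.

    station_weights: one list of NumPy arrays (one array per layer) per station.
    sample_counts:   number of local training samples per station (assumed weighting).
    """
    total = float(sum(sample_counts))
    n_layers = len(station_weights[0])
    averaged = []
    for layer in range(n_layers):
        # Weighted sum of this layer's parameters across all stations.
        layer_avg = sum(
            (count / total) * weights[layer]
            for weights, count in zip(station_weights, sample_counts)
        )
        averaged.append(layer_avg)
    return averaged


# Two hypothetical stations, each holding one weight matrix and one bias vector.
station_a = [np.array([[1.0, 2.0]]), np.array([0.0])]
station_b = [np.array([[3.0, 4.0]]), np.array([1.0])]
global_weights = federated_average([station_a, station_b], sample_counts=[100, 300])
```

Only the parameter arrays and sample counts cross the network in this scheme; the raw measurements stay at each station.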

Our article Privacy-aware machine learning with low network footprint describes the benefits of federated learning in telecommunications. Since only the parameters of the predictive models are transferred, this can reduce the volume of traffic in the network.
Within the scope of this project, we envisioned a decentralized setup consisting of multiple air quality stations, where each station would collect data for a specific area, have compute capabilities that would enable it to train a predictive model using locally collected data, and communicate with other air quality stations.
Since such a setup does not yet exist, the students simulated it using measurements collected by the Swedish Meteorological and Hydrological Institute (SMHI). Even though that was a centralized dataset, the students divided it per weather station (Stockholm E4/E20 Lilla Essingen, Stockholm Sveavägen 59, Stockholm Hornsgatan 108, and Stockholm Torkel Knutssonsgatan) and trained four individual models, which were later aggregated by way of federated averaging.
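Simulating the decentralized setup from a centralized dataset comes down to partitioning the records by station, so that each simulated station only ever sees its own data. The sketch below shows one way to do that in plain Python. The record layout (the `station`, `hour`, and `value` fields) and the numbers are hypothetical placeholders, not the SMHI schema or real measurements.

```python
from collections import defaultdict


def split_by_station(rows, station_key="station"):
    """Group measurement records by the station that produced them."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[station_key]].append(row)
    return dict(partitions)


# Toy records with placeholder values; the field names are assumptions.
rows = [
    {"station": "Stockholm Hornsgatan 108", "hour": 0, "value": 1.0},
    {"station": "Stockholm Sveavägen 59", "hour": 0, "value": 2.0},
    {"station": "Stockholm Hornsgatan 108", "hour": 1, "value": 3.0},
]
per_station = split_by_station(rows)
```

Each entry of `per_station` can then be used to train one local model, mimicking a station that only has access to its own measurements.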
Validating results always requires a baseline for comparison. In this case, a high-performing centralized model was developed as the baseline against which the federated models were validated. Each student worked on the same train/test/validation dataset but explored it in different ways, using different features and different machine learning model architectures. Testing such models in parallel and comparing them based on their accuracy, measured as Symmetric Mean Absolute Percentage Error (SMAPE) and Mean Absolute Error (MAE), enabled the students to cover a large space of different settings and converge on a high-performing centralized model.
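For readers unfamiliar with the two metrics, the sketch below computes both with NumPy. SMAPE has several variants in the literature; this one divides the absolute error by the mean of the absolute actual and predicted values and reports a fraction between 0 and 2. Which variant the students used is an assumption on our part, as is the choice to count a 0/0 term as zero error.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the prediction errors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def smape(y_true, y_pred):
    """Symmetric Mean Absolute Percentage Error, as a fraction in [0, 2]."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    diff = np.abs(y_true - y_pred)
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0
    # Where both values are zero, count the term as zero error (assumed convention).
    ratio = np.divide(diff, denom, out=np.zeros_like(diff), where=denom != 0)
    return float(np.mean(ratio))


# Toy values for illustration only.
example_mae = mae([10.0, 20.0], [12.0, 18.0])
example_smape = smape([10.0, 20.0], [12.0, 18.0])
```

MAE is expressed in the unit of the predicted quantity, while SMAPE is relative, which makes it easier to compare across pollutants with different value ranges.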
Ten input features were used as input to the machine learning model trained by the students.

Different models were implemented, including Long Short-Term Memory (LSTM) networks and Deep Neural Networks (DNNs), to predict the next 1, 6, and 24 hours.
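Training a model for a given prediction horizon usually starts by turning an hourly series into supervised samples: a window of past values as input and the value a fixed number of hours ahead as the target. The sketch below illustrates that step with NumPy for the 1, 6, and 24 hour horizons. The 24-hour input window, the single-value target, and the function name are assumptions for illustration; the students' models may have framed the task differently.

```python
import numpy as np


def make_windows(series, n_lags, horizon):
    """Build (inputs, targets) pairs from an hourly series.

    Each input holds n_lags consecutive values; the target is the value
    `horizon` hours after the last value of that input window.
    """
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - n_lags - horizon + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested window and horizon")
    X = np.stack([series[i : i + n_lags] for i in range(n_samples)])
    y = np.array([series[i + n_lags + horizon - 1] for i in range(n_samples)])
    return X, y


# A synthetic 48-hour series (0, 1, ..., 47) stands in for real measurements.
hourly = np.arange(48)
X_1h, y_1h = make_windows(hourly, n_lags=24, horizon=1)
X_6h, y_6h = make_windows(hourly, n_lags=24, horizon=6)
X_24h, y_24h = make_windows(hourly, n_lags=24, horizon=24)
```

Longer horizons leave fewer usable samples from the same series and put more distance between input and target, which is one reason day-ahead prediction tends to be harder than hour-ahead prediction.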
In the centralized case, the models that aimed to predict the next hour performed better on average than those that aimed to predict the next day. Scores ranged from 0.282 to 0.5214 SMAPE and 0.22 to 0.47 MAE.
On the federated model side, similar MAE scores were observed, which shows that the decentralized setup we initially envisioned could be supported by decentralized training methods such as federated learning.
If you want to dig into the details, check out the project reports from the two project groups. You can find everything about the implementation on our Decentralized Air quality Monitoring and Prediction GitHub.
Round off
Ericsson Research and Uppsala University have a long history of collaboration within the scope of the Computer Science project course, which is basically where we, the industry, get to pitch a challenging problem to a group of brave students. Within this premise, our guidance combined with the supervision provided by the university enables the students to self-organize and tackle the challenge. This usually involves embracing the Scrum way of working, putting a lot of effort into the development of a working prototype, letting off some steam at a social event, and finally sharing the codebase and the project report to make the work available to others.

But this year was different. As a result of the COVID-19 pandemic, the entire course needed to be handled remotely. This posed an interesting challenge, since the students did not get to be in the same room, work side by side on the task, and get to know each other while whiteboarding. Instead, they needed to do it all remotely, including the social event. To handle this productively, the teaching assistants set up a small competition, a mini version of Kaggle, which enabled the students to compete by trying out different hyperparameters when tuning their predictive models.
Toward the creation of a more sustainable planet, it is great to see how methods such as federated learning can contribute, not only by simplifying and improving the process of training a machine learning model and its overall lifecycle management, but also by improving the quality of people's lives through air quality prediction. We expect more applications of federated learning and other methods to contribute toward that goal.
Learn more
Read the project report from group 1
Read the project report from group 2
Learn details of the implementation on our Decentralized Air quality Monitoring and Prediction GitHub.
Explore how ICT, including AI, helps pioneer a sustainable future.
