Deep Learning Applied in Real-World Applications


Accounting: Financial Close



Disclaimer

All software and hardware used or referenced in this guide belong to their respective vendors. We developed this guide based on our own development infrastructure, and it may or may not work on other systems and technical infrastructure. We are not liable for any direct or indirect problems caused to users of this guide.
Executive Summary

The purpose of this document is to provide adequate information for users to implement an Artificial Neural Network model. To achieve this, we use one of the most common financial problems that occurs in every company. The problem is solved using an Artificial Neural Network, a deep learning model.
Business Problem

Problem Statement

Identify whether the given transaction is an intercompany transaction or not.

Business Challenges

  • Time-consuming
  • Different data sources, data formats and data frequencies
  • Labor-intensive

Business Context

It is a legal obligation for “Company A” to disclose its financials to the internal and/or external stakeholders of the company. During this process, “Company A” collects financial data from all of its subsidiaries and then identifies intercompany transactions (e.g., transactions that took place between the parent company and a subsidiary, which must be eliminated because they are not transactions with external parties).

Traditionally, this process has been performed by “Company A” using various systems, several data sets and a group of accounting experts, and it has been a weeklong task.

“Company A” decided to use artificial intelligence to automate this process, in order to increase productivity and reduce the time taken to complete this task.
High Level Implementation Steps

Step 1: Define a Clear Problem Statement and Problem Type.

Step 2: Data Engineering - Data Collection, Data Preparation and Data Provisioning.

Step 3: Feature Engineering - Feature engineering is the process of applying domain knowledge of the data to create features that make machine learning algorithms work.

Step 4: Model Selection - Model selection is the process of choosing between different machine learning approaches.

Step 5: Model Implementation

  • Import the Required Libraries.
  • Import the Training Data.
  • Extract the Features & Labels.
  • Convert Categorical Values to Numerical Values using Label Encoder.
  • Train the Model.
  • Review the Learning Algorithm.
  • Import Test Data.
  • Run the model on Test Data.
  • Review the model outcome & write the model Outcome to a file. Open the file to Review the Outcome.
Model Selection

Model selection is the process of choosing between different deep learning approaches - e.g. ANN, CNN, RNN etc. - or choosing between different hyperparameters or sets of features for the same deep learning approach.

The choice of the actual deep learning algorithm (e.g. ANN or CNN) is less important than you'd think - there may be a "best" algorithm for a particular problem, but often its performance is not much better than other well-performing approaches for that problem.

There may be certain qualities you look for in a model:

  • Interpretable - can we see or understand why the model is making the decisions it makes?
  • Simple - easy to explain and understand
  • Accurate
  • Fast (to train and test)
  • Scalable (it can be applied to a large dataset)

Our problem here is a supervised classification problem: identify whether a given transaction is an intercompany transaction or not. This type of problem can be solved by the following models.

  • Artificial Neural Networks
  • Convolutional Neural Networks
  • Recurrent Neural Networks

We use an Artificial Neural Network (ANN) as the model for this problem statement, since we are solving a binary (two-class) classification problem and ANNs typically perform well on binary classification tasks in terms of accuracy and performance.
Feature Engineering

  • Feature engineering is the process of using industry and business knowledge of the data to create features that make deep learning algorithms work. If feature engineering is done correctly, it increases the predictive power of deep learning algorithms by creating features from raw data that help facilitate the learning process. Feature engineering is as much an art as a science.
  • Feature engineering is one of the most important steps in machine learning and can make the difference between a good model and a bad model. In deep learning, features generally do not have to be selected manually; deep learning models learn useful feature representations automatically. This is one of the advantages that deep learning models offer.

Advantages of Feature Engineering:

  • Good features provide you with the flexibility of choosing an algorithm; even if you choose a less complex model, you get good accuracy.
  • If you choose good features, then even simple ML algorithms do well.
  • Better features lead to better accuracy. You should spend time on feature engineering to generate the appropriate features for your dataset. If you derive the best and most appropriate features, you have won most of the battle.
Data Management

There are three types of datasets used at various stages of this implementation: training, test, and development. The training dataset is the largest of the three, while the test data functions as the final seal of approval and is not used until the end of development. Sometimes the test and development datasets can be the same. A minimal split sketch follows below.
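As a minimal illustration (not part of the lab's actual workflow, where the training and test data are provided as separate Excel files), the sketch below shows how one labeled table could be split into training, development, and test subsets with scikit-learn; the file name and label column are hypothetical.

import pandas as vAR_pd
from sklearn.model_selection import train_test_split

# Hypothetical labeled dataset; in this guide the splits are provided as
# separate Excel files, so this split is for illustration only.
vAR_df = vAR_pd.read_excel('transactions.xlsx')
vAR_features = vAR_df.drop('Inter_Company_Transaction_Flag', axis=1)
vAR_labels = vAR_df['Inter_Company_Transaction_Flag']

# 70% training, 15% development, 15% test
vAR_X_train, vAR_X_rest, vAR_y_train, vAR_y_rest = train_test_split(
    vAR_features, vAR_labels, test_size=0.3, random_state=42)
vAR_X_dev, vAR_X_test, vAR_y_dev, vAR_y_test = train_test_split(
    vAR_X_rest, vAR_y_rest, test_size=0.5, random_state=42)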
What is a Training Data Set?

The training data set is the actual dataset used to train the model for performing various deep learning operations (DNN, CNN, RNN, etc.). This is the data from which the model learns, using various APIs and algorithms, so that it can eventually make predictions automatically.

The following section describes the training data and its associated characteristics. These characteristics are:

Training Data Set

  • Company
  • Company Country
  • Posting Date
  • Document Date
  • Currency
  • Trading Partner
  • Trading Partner Country
  • Transaction Type
  • Data Source
  • Data Category
  • Account
  • Transaction Amount
  • Inter Company Transaction Flag
What is a Test Data Set?

The test data set helps you validate that training has been effective in terms of metrics such as accuracy or precision. This data is used to test whether the model responds and performs appropriately.

Test Data Set

The following section describes the features that are used in the model for predictions.

  • Company Code
  • Trading Partner
  • Trading Partner Country
  • Transaction Type
  • Data Category
What is a Learning Algorithm?

  • A self-learning code (rather than hand-written rules) that performs data analysis and extracts patterns (business characteristics) from data for business application development - a modern approach to application/software development.
  • Automatically understands and extracts data patterns when business circumstances change, and performs data analysis based on the new or changed data - no code change is required to reflect changes that took place in the data (changes in the business).
Deep Learning Libraries Used

There are several machine learning and data engineering libraries available. We are using the following libraries; they and their associated functions are readily available in Python for developing business applications. A quick version-check sketch follows the list below.

  • TensorFlow 1.10.0
  • Keras 2.2.2
  • Pandas 0.20.3
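The following is a small, optional sketch to confirm the library versions assumed in this guide; the version strings on your system may differ, and nearby versions may still work.

import tensorflow as vAR_tf
import keras as vAR_keras
import pandas as vAR_pd

# Print installed versions; the guide assumes TensorFlow 1.10.0, Keras 2.2.2, Pandas 0.20.3
print("TensorFlow:", vAR_tf.__version__)
print("Keras:", vAR_keras.__version__)
print("Pandas:", vAR_pd.__version__)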

Classifier/Model Used

As explained above, we are using a supervised deep learning model, the Artificial Neural Network (ANN).

Artificial Neural Networks

Model Building Blocks: There are several technical and functional components involved in implementing this model. Here are the key building blocks to implement the model.

Model Building Blocks

Model Building Implementation Steps

A model implementation to address a given problem involves several steps. Here are the key steps involved in implementing a model. You can customize these steps as needed; we developed them for learning purposes only.

Model Building Implementation Steps
Model Implementation Code Block

  • # Step 1- Import the Required Libraries
    import pandas as vAR_pd
    from sklearn.preprocessing import LabelEncoder
    #import numpy as np
    from keras.models import Sequential
    from keras.layers import Dense, Dropout
    from keras.optimizers import Adam
  • #Step 2- Import Training Data
    vAR_df = vAR_pd.read_excel(vAR_Fetched_Data_Source_Path_Input_Data)
  • # Step 3 - Convert Categorical Data into Numerical Values using Label Encoder
    vAR_le = LabelEncoder()
    vAR_Transaction_Type_Conversion = vAR_le.fit_transform(vAR_df.iloc[:,7])
    vAR_Transaction_Type_Conversion_df = vAR_pd.DataFrame(vAR_Transaction_Type_Conversion,columns={'Transaction_Type_Converted'})
    vAR_Data_Category_Conversion = vAR_le.fit_transform(vAR_df.iloc[:,9])
    vAR_Data_Category_Conversion_df = vAR_pd.DataFrame(vAR_Data_Category_Conversion,columns={'Data_Category_Converted'})
    # Attach the converted numerical columns to the main dataframe
    vAR_df1 = vAR_df.merge(vAR_Transaction_Type_Conversion_df,left_index=True, right_index=True)
    vAR_df2 = vAR_df1.merge(vAR_Data_Category_Conversion_df,left_index=True, right_index=True)
  • # Step 4 - Train the Model
    vAR_Features_train = vAR_pd.read_excel(vAR_Fetched_Data_Train_All_Features)
    vAR_Label_train = vAR_df.iloc[:,12]
    vAR_model = Sequential()
    vAR_model.add(Dense(4, input_shape=(4,), activation='relu',name='Input_Layer'))
    vAR_model.add(Dense(10, activation='relu',name='Hidden_Layer'))
    vAR_model.add(Dropout(0.5)) # Adding Dropout Prevents Overfitting
    vAR_model.add(Dense(1, activation='sigmoid', name='Output_Layer')) # Sigmoid (not softmax) for a single binary output
    # Adam optimizer with learning rate of 0.005
    vAR_optimizer = Adam(lr=0.005)
    vAR_model.compile(vAR_optimizer, loss='binary_crossentropy', metrics=['accuracy'])
    vAR_model.fit(vAR_Features_train, vAR_Label_train, verbose=1, batch_size=5, epochs=200)
  • # Step 5 - Review Learning Algorithm
    vAR_model.predict(vAR_Features_train)
  • # Step 6 - Import Test Data
    vAR_df3 = vAR_pd.read_excel(vAR_Fetched_Data_Source_Path_Test_Data)
    vAR_Transaction_Type_Conversion_test = vAR_le.fit_transform(vAR_df3.iloc[:,3])
    vAR_Transaction_Type_Conversion_test_df = vAR_pd.DataFrame(vAR_Transaction_Type_Conversion_test,columns={'Transaction_Type_Converted'})
    vAR_Data_Category_Conversion_test = vAR_le.fit_transform(vAR_df3.iloc[:,4])
    vAR_Data_Category_Conversion_test_df = vAR_pd.DataFrame(vAR_Data_Category_Conversion_test,columns={'Data_Category_Converted'})
    vAR_df4 = vAR_df3.merge(vAR_Transaction_Type_Conversion_test_df,left_index=True, right_index=True)
    vAR_df5 = vAR_df4.merge(vAR_Data_Category_Conversion_test_df,left_index=True, right_index=True)
    vAR_Features_test = vAR_pd.read_excel(vAR_Fetched_Data_Test_All_Features)
  • # Step 7 - Running Model on Test Data
    vAR_Labels_Pred = vAR_model.predict(vAR_Features_test)
  • # Step 8 - Review Model Outcome
    vAR_Labels_Pred = vAR_pd.DataFrame(vAR_Labels_Pred,columns={'Predicted_Inter_Transaction_Type'})
    vAR_Features_test = vAR_Features_test.sort_index()
  • # Step 9 - Write Model Outcome to File
    vAR_df6 = vAR_pd.read_excel(vAR_Fetched_Data_Source_Path_Test_Data)
    vAR_df7 = vAR_df6.merge(vAR_Labels_Pred,left_index=True, right_index=True)
    vAR_df8 = vAR_df7.to_excel(vAR_Fetched_Data_Model_Path, engine='xlsxwriter')
  • # Step 10 - To open and view the file outcome
    vAR_df9 = vAR_pd.read_excel(vAR_Fetched_Data_Model_Path)
    vAR_df9
Model Implementation Steps

Step 0 - Open Jupyter Notebook

Jupyter Notebook is launched from the command prompt. Type cmd in the search box and press Enter to open a Command Prompt terminal.

Model Implementation Steps

Now, type jupyter notebook and press Enter, as shown.

Model Implementation Steps

After pressing Enter, the page below opens.

Model Implementation Steps

Open a New File or New Program in Jupyter Notebook

To open a new file, follow the instructions below.

Go to New >>> Python [conda root]

Model Implementation Steps

Give the file a meaningful name, as shown below.

Model Implementation Steps

Model Implementation Steps
Step 1 - Import the Required Libraries

For our model implementation we need the following libraries:

Tensorflow: TensorFlow is an open-source machine learning library for research and production. TensorFlow offers APIs for beginners and experts to develop for desktop, mobile, web, and cloud.

Keras: Keras is an open source neural network library written in Python. It is capable of running on top of TensorFlow, Microsoft Cognitive Toolkit or Theano.

Pandas: Pandas is a library used for data manipulation and analysis. For our implementation, we use it to import the data file and create data frames (which store the data).

Import Required Libraries Used
Step 2 - Import the Training Data

The next step after importing the libraries is to import the training data. We import the training data stored on our local system using the Pandas library.

Import the Training Data
Step 3 - Feature & Label Selection

Step 3 of the implementation is feature selection. Deep learning works on a simple rule: if you put garbage in, you will only get garbage out. By garbage here, we mean noise in the data. This becomes even more important when the number of features is very large. We need only those features (inputs) that are a function of the labels (outputs). For example, to predict whether a given fruit is an apple or an orange, the color/texture of the fruit becomes a feature to consider: if the color is red, it is an apple; if it is orange, it is an orange.

The features selected must be numerical. If they are not, they have to be converted from categorical values to numerical values. In our scenario we use a Label Encoder for the conversion (a minimal sketch follows below).

The features selected are Company, Trading Partner, Transaction Type, and Data Category. The label is the target variable, i.e., the Inter Company Transaction flag.

Feature & Label Selection
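For readers following along without the screenshots, here is a minimal, self-contained sketch of the label-encoding idea used in this step; the example values are hypothetical, and in the lab the conversion is applied to the Transaction Type and Data Category columns of the training dataframe.

import pandas as vAR_pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical example values for two categorical columns
vAR_df = vAR_pd.DataFrame({'Transaction_Type': ['Sale', 'Purchase', 'Sale', 'Transfer'],
                           'Data_Category': ['Actual', 'Budget', 'Actual', 'Actual']})

vAR_le = LabelEncoder()
vAR_df['Transaction_Type_Converted'] = vAR_le.fit_transform(vAR_df['Transaction_Type'])
vAR_df['Data_Category_Converted'] = vAR_le.fit_transform(vAR_df['Data_Category'])
print(vAR_df)  # Each category is now represented by an integer code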
Step 4 - Training the Model

Step 4 is training the model, i.e., making the model learn, understand, and recognize the patterns in the data. The training data is the model's input, while the test data is used for testing the model. A minimal sketch of this step follows below.

We use a Binary Classification Model for our Classification Problem.

model.fit(Features_train, Labels_train)

Training the Model
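As a reference alongside the screenshot, here is a minimal sketch of this training step, mirroring the layer stack from the main code block (with a sigmoid output for the binary label); it assumes vAR_Features_train and vAR_Label_train have already been prepared as in Steps 2 and 3.

from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import Adam

# Assumes vAR_Features_train (the four encoded features) and vAR_Label_train are already loaded
vAR_model = Sequential()
vAR_model.add(Dense(4, input_shape=(4,), activation='relu', name='Input_Layer'))
vAR_model.add(Dense(10, activation='relu', name='Hidden_Layer'))
vAR_model.add(Dropout(0.5))  # Dropout helps prevent overfitting
vAR_model.add(Dense(1, activation='sigmoid', name='Output_Layer'))  # Binary output
vAR_model.compile(Adam(lr=0.005), loss='binary_crossentropy', metrics=['accuracy'])
vAR_model.fit(vAR_Features_train, vAR_Label_train, batch_size=5, epochs=200, verbose=1)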
Step 5 - Review the Learning Algorithm

As a next step, we review how the algorithm has learned from the features we provided, as shown.

Review the Learning Algorithm
Step 6 - Import Test Data

Import the test data; this is the data used to test how the model performs.

Import Test Data
Step 7 - Running the Model on Test Data

Next, we test the model with the test data, as shown.

Running the Model on Test Data
Step 8 - Review the Model Outcome

Next, we review the output of the model, i.e., the predictions it has made on the test data.

Review the Model Outcome
Step 9 - Write the Model Outcome to a File

Next, we write the model output to an Excel file for analysis.

Write the Model to a File
Step 10 - Open the File to View the Outcome

Open the written file and check the outcome, as shown. Execute the cell to view the data.

Open the File to View the Outcome

Open the File to View the Outcome
Conclusion

In this lab work, we used an Artificial Neural Network, a deep learning model, to predict whether a given transaction is an intercompany transaction or not. The model performed well on the test data and predicted the outcome as expected. The model outcome is written to a persistent file for further data analysis and business decision-making.

This is a very basic implementation to learn and better understand the overall steps and processes involved in implementing a deep learning model. In real projects there are many more steps, processes, data sources and technologies involved. We strongly recommend that you continue learning and prepare yourself to address real-world problems.

Appendix

Model Fitting in Machine Learning

Fitting is a measure of how well a machine learning model generalizes to data similar to that on which it was trained. A model that is well fitted produces more accurate outcomes, a model that is overfitted matches the data too closely, and a model that is underfitted doesn’t match closely enough. Fitting is the essence of machine learning. If your model doesn’t fit your data correctly, the outcomes it produces will not be accurate enough to be useful for practical decision-making.

Types of Fitting:

  • Best Fitting
  • Over Fitting
  • Under Fitting

Best Fitting: The model is best fitting when it performs well on training examples and also performs well on unseen data. Ideally, a model that makes predictions with zero error is said to have a best fit on the data. This situation is achievable at a spot between overfitting and underfitting. To understand it, we have to look at the performance of our model over time as it learns from the training dataset.
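As a rough, hedged way to watch for a good fit while training the ANN from this guide, the sketch below holds out part of the training data as a validation split and compares training and validation accuracy; it assumes the compiled vAR_model, vAR_Features_train, and vAR_Label_train from the main code block.

# Assumes the compiled vAR_model, vAR_Features_train and vAR_Label_train from the main code block
vAR_history = vAR_model.fit(vAR_Features_train, vAR_Label_train,
                            validation_split=0.2, batch_size=5, epochs=200, verbose=0)
# With metrics=['accuracy'], Keras 2.2.2 records these under 'acc' and 'val_acc'
print("Final training accuracy:", vAR_history.history['acc'][-1])
print("Final validation accuracy:", vAR_history.history['val_acc'][-1])
# Curves that stay close together suggest a good fit; a large gap suggests overfitting,
# and low values for both suggest underfitting.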

Training Data Set (Example-1)

Training Data Set

Test Data Set (Example-1)

Test Data Set

Best Fitting Model Code Block (Example-1)

if vAR_Fetched_Data_Model_Fitting_Best_Fit_Test =='Y':
    import matplotlib.pyplot as vAR_plt
    import pandas as vAR_pd
    vAR_df8 = vAR_pd.read_csv(open(vAR_Fetched_Data_Best_Fit_File_Example_1,'r',encoding ='utf-8'))
    vAR_df9 = vAR_df8.merge(vAR_Labels_Pred,left_index=True, right_index=True)
    vAR_plt.scatter(vAR_df.iloc[:,0],vAR_df.iloc[:,12],s=100, c='rgb')
    #print(vAR_df11)
    vAR_plt.plot(vAR_df9.iloc[:,0],vAR_df9.iloc[:,5])
    vAR_plt.xlabel('Company')
    vAR_plt.ylabel('Predicted_Inter_Transaction')
    #plt.show()
    vAR_plt.savefig(vAR_Fetched_Data_Best_Fit_Image_Example_1)

Best Fitting Model Plotted (Example-1)

Best Fitting Model Plotted

Training Data Set (Example-2)

Training Data Set

Test Data Set (Example-2)

Test Data Set

Best Fitting Model Code Block (Example-2)

if vAR_Fetched_Data_Model_Fitting_Best_Fit_Test =='Y':
    import matplotlib.pyplot as vAR_plt
    vAR_df8 = vAR_pd.read_csv(open(vAR_Fetched_Data_Best_Fit_File_Example_2,'r',encoding ='utf-8'))
    vAR_df9 = vAR_df8.merge(vAR_Labels_Pred,left_index=True, right_index=True)
    vAR_plt.scatter(vAR_df.iloc[:,0],vAR_df.iloc[:,12],s=100, c='bky')
    vAR_plt.plot(vAR_df9.iloc[:,0],vAR_df9.iloc[:,5])
    vAR_plt.xlabel('Company')
    vAR_plt.ylabel('Predicted_Inter_Transaction')
    #plt.show()
    vAR_plt.savefig(vAR_Fetched_Data_Best_Fit_Image_Example_2)

Best Fitting Model Plotted (Example-2 )

Best Fitting Model Plotted

Training Data Set (Example-3)

Training Data Set

Test Data Set (Example-3)

Test Data Set

Best Fitting Model Code Block (Example-3)

if vAR_Fetched_Data_Model_Fitting_Best_Fit_Test =='Y':
    import matplotlib.pyplot as vAR_plt
    vAR_df8 = vAR_pd.read_csv(open(vAR_Fetched_Data_Best_Fit_File_Example_3,'r',encoding ='utf-8'))
    vAR_df9 = vAR_df8.merge(vAR_Labels_Pred,left_index=True, right_index=True)
    vAR_plt.scatter(vAR_df.iloc[:,0],vAR_df.iloc[:,12],s=100,c='gcg')
    vAR_plt.plot(vAR_df9.iloc[:,0],vAR_df9.iloc[:,5])
    vAR_plt.xlabel('Company')
    vAR_plt.ylabel('Predicted_Inter_Transaction')
    #plt.show()
    vAR_plt.savefig(vAR_Fetched_Data_Best_Fit_Image_Example_3)

Best Fitting Model Plotted (Example-3)

Best Fitting Model Plotted
  Over Fitting

The model is overfitting when it performs well on training examples but does not perform well on unseen data. It is often the result of an excessively complex model. It happens because the model memorizes the relationship between the input examples (often called X) and the target variable (often called y), and is therefore unable to generalize to new data. An overfitting model predicts the targets in the training data set very accurately, but not in unseen data.
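As a hedged sketch of two common ways to reduce overfitting in the ANN used in this guide (dropout and early stopping), the snippet below stops training once validation loss stops improving; it assumes vAR_Features_train and vAR_Label_train are loaded as in the main code block.

from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.callbacks import EarlyStopping

# Assumes vAR_Features_train and vAR_Label_train are already loaded
vAR_model = Sequential()
vAR_model.add(Dense(10, input_shape=(4,), activation='relu'))
vAR_model.add(Dropout(0.5))  # Randomly drops units during training
vAR_model.add(Dense(1, activation='sigmoid'))
vAR_model.compile('adam', loss='binary_crossentropy', metrics=['accuracy'])

# Stop training once validation loss has not improved for 10 epochs
vAR_early_stop = EarlyStopping(monitor='val_loss', patience=10)
vAR_model.fit(vAR_Features_train, vAR_Label_train, validation_split=0.2,
              epochs=200, batch_size=5, callbacks=[vAR_early_stop], verbose=0)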

Training Data Set (Example-1)

Training Data Set

Test Data Set (Example-1)

Test Data Set

Over Fitting Model Code Block (Example-1)

if vAR_Fetched_Data_Model_Fitting_Over_Fit_Test =='Y':
    import matplotlib.pyplot as vAR_plt
    vAR_df8 = vAR_pd.read_csv(open(vAR_Fetched_Data_Over_Fit_File_Example_1,'r',encoding ='utf-8'))
    vAR_df9 = vAR_df8.merge(vAR_Labels_Pred,left_index=True, right_index=True)
    vAR_plt.scatter(vAR_df.iloc[:,0],vAR_df.iloc[:,12],s=100, c='rgb')
    #print(vAR_df11)
    vAR_plt.plot(vAR_df9.iloc[:,0],vAR_df9.iloc[:,5])
    vAR_plt.xlabel('Company')
    vAR_plt.ylabel('Predicted_Inter_Transaction')
    #plt.show()
    vAR_plt.savefig(vAR_Fetched_Data_Over_Fit_Image_Example_1)

Over Fitting Model Plotted (Example-1)

Over Fitting Model Plotted

Training Data Set (Example-2)

Training Data Set

Test Data Set (Example-2)

Test Data Set

Over Fitting Model Code Block (Example-2)

if vAR_Fetched_Data_Model_Fitting_Over_Fit_Test =='Y':
    import matplotlib.pyplot as vAR_plt
    vAR_df8 = vAR_pd.read_csv(open(vAR_Fetched_Data_Over_Fit_File_Example_2,'r',encoding ='utf-8'))
    vAR_df9 = vAR_df8.merge(vAR_Labels_Pred,left_index=True, right_index=True)
    vAR_plt.scatter(vAR_df.iloc[:,0],vAR_df.iloc[:,12],s=100, c='rgb')
    #print(vAR_df11)
    vAR_plt.plot(vAR_df9.iloc[:,0],vAR_df9.iloc[:,5])
    vAR_plt.xlabel('Company')
    vAR_plt.ylabel('Predicted_Inter_Transaction')
    #plt.show()
    vAR_plt.savefig(vAR_Fetched_Data_Over_Fit_Image_Example_2)

Over Fitting Model Plotted (Example-2)

Over Fitting Model Plotted

Training Data Set (Example-3)

Training Data Set

Test Data Set (Example-3)

Test Data Set

Over Fitting Model Code Block (Example-3)

if vAR_Fetched_Data_Model_Fitting_Over_Fit_Test =='Y':
    import matplotlib.pyplot as vAR_plt
    vAR_df8 = vAR_pd.read_csv(open(vAR_Fetched_Data_Over_Fit_File_Example_3,'r',encoding ='utf-8'))
    vAR_df9 = vAR_df8.merge(vAR_Labels_Pred,left_index=True, right_index=True)
    vAR_plt.scatter(vAR_df.iloc[:,0],vAR_df.iloc[:,12],s=100, c='rgb')
    #print(vAR_df11)
    vAR_plt.plot(vAR_df9.iloc[:,0],vAR_df9.iloc[:,5])
    vAR_plt.xlabel('Company')
    vAR_plt.ylabel('Predicted_Inter_Transaction')
    #plt.show()
    vAR_plt.savefig(vAR_Fetched_Data_Over_Fit_Image_Example_3)

Over Fitting Model Plotted (Example-3)

Over Fitting Model Plotted
 Under Fitting

The predictive model is said to be underfitting if it performs poorly on training data. This happens because the model is unable to capture the relationship between the input examples and the target variable. It could be because the model is too simple, i.e., the input features are not expressive enough to describe the target variable well. An underfitting model does not predict the targets in the training data set very accurately. Underfitting can be addressed by increasing model capacity, engineering more expressive features, or training for longer.
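As a hedged sketch of one way to address underfitting in this guide's setting, the snippet below gives the network more capacity (wider hidden layers) than the original model; it assumes vAR_Features_train and vAR_Label_train are loaded as in the main code block.

from keras.models import Sequential
from keras.layers import Dense

# Assumes vAR_Features_train and vAR_Label_train are already loaded
vAR_model = Sequential()
vAR_model.add(Dense(32, input_shape=(4,), activation='relu'))  # Wider hidden layers add capacity
vAR_model.add(Dense(16, activation='relu'))
vAR_model.add(Dense(1, activation='sigmoid'))
vAR_model.compile('adam', loss='binary_crossentropy', metrics=['accuracy'])
vAR_model.fit(vAR_Features_train, vAR_Label_train, epochs=400, batch_size=5, verbose=0)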

Training Data Set (Example-1)

Training Data Set

Test Data Set (Example-1)

Test Data Set

Under Fitting Model Code Block (Example-1)

if vAR_Fetched_Data_Model_Fitting_Under_Fit_Test =='Y':
    import matplotlib.pyplot as vAR_plt
    vAR_df8 = vAR_pd.read_csv(open(vAR_Fetched_Data_Under_Fit_File_Example_1,'r',encoding ='utf-8'))
    vAR_df9 = vAR_df8.merge(vAR_Labels_Pred,left_index=True, right_index=True)
    vAR_plt.scatter(vAR_df.iloc[:,0],vAR_df.iloc[:,12],s=100, c='rgb')
    #print(vAR_df11)
    vAR_plt.plot(vAR_df9.iloc[:,0],vAR_df9.iloc[:,5])
    vAR_plt.xlabel('Company')
    vAR_plt.ylabel('Predicted_Inter_Transaction')
    #plt.show()
    vAR_plt.savefig(vAR_Fetched_Data_Under_Fit_Image_Example_1)

Under Fitting Model Plotted (Example-1)

Under Fitting Model Plotted

Training Data Set (Example-2)

Training Data Set

Test Data Set (Example-2)

Test Data Set

Under Fitting Model Code Block (Example-2)

if vAR_Fetched_Data_Model_Fitting_Under_Fit_Test =='Y':
    import matplotlib.pyplot as vAR_plt
    vAR_df8 = vAR_pd.read_csv(open(vAR_Fetched_Data_Under_Fit_File_Example_2,'r',encoding ='utf-8'))
    vAR_df9 = vAR_df8.merge(vAR_Labels_Pred,left_index=True, right_index=True)
    vAR_plt.scatter(vAR_df.iloc[:,0],vAR_df.iloc[:,12],s=100, c='ycr')
    #print(vAR_df11)
    vAR_plt.plot(vAR_df9.iloc[:,0],vAR_df9.iloc[:,5])
    vAR_plt.xlabel('Company')
    vAR_plt.ylabel('Predicted_Inter_Transaction')
    #plt.show()
    vAR_plt.savefig(vAR_Fetched_Data_Under_Fit_Image_Example_2)

Under Fitting Model Plotted (Example-2)

Under Fitting Model Plotted

Training Data Set (Example-3)

Training Data Set

Test Data Set (Example-3)

Test Data Set

Under Fitting Model Code Block (Example-3)

if vAR_Fetched_Data_Model_Fitting_Under_Fit_Test =='Y':
    import matplotlib.pyplot as vAR_plt
    vAR_df8 = vAR_pd.read_csv(open(vAR_Fetched_Data_Under_Fit_File_Example_3,'r',encoding ='utf-8'))
    vAR_df9 = vAR_df8.merge(vAR_Labels_Pred,left_index=True, right_index=True)
    vAR_plt.scatter(vAR_df.iloc[:,0],vAR_df.iloc[:,12],s=100, c='gbc')
    #print(vAR_df11)
    vAR_plt.plot(vAR_df9.iloc[:,0],vAR_df9.iloc[:,5])
    vAR_plt.xlabel('Company')
    vAR_plt.ylabel('Predicted_Inter_Transaction')
    #plt.show()
    vAR_plt.savefig(vAR_Fetched_Data_Under_Fit_Image_Example_3)

Under Fitting Model Plotted (Example-3)

Under Fitting Model Plotted
Cross Validation

Cross-validation is a technique in which we train our model using a subset of the data set and then evaluate it using the complementary subset of the data set.
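The example code blocks below use cross_val_predict with a scikit-learn LogisticRegression. As a hedged, minimal sketch of the same idea, the snippet here reports k-fold cross-validation accuracy; it assumes vAR_Features_train and vAR_Label_train are loaded as in the main code block.

from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression

# Assumes vAR_Features_train and vAR_Label_train are already loaded
vAR_model = LogisticRegression()
vAR_scores = cross_val_score(vAR_model, vAR_Features_train, vAR_Label_train, cv=5)
print("Fold accuracies:", vAR_scores)
print("Mean accuracy:", vAR_scores.mean())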

Training Data Set (Example-1)

Training Data Set

Test Data Set (Example-1)

Test Data Set

 

Cross Validation Model Code Block (Example-1)

if vAR_Fetched_Data_Cross_Validation_Required =='Y':
    #from sklearn import datasets
    from sklearn.model_selection import cross_val_predict
    from sklearn.linear_model import LogisticRegression
    import matplotlib.pyplot as vAR_plt
    vAR_model = LogisticRegression()
    vAR_Predicted = cross_val_predict(vAR_model, vAR_Features_train, vAR_Label_train , cv=2)
    vAR_fig, vAR_ax = vAR_plt.subplots()
    vAR_ax.scatter(vAR_Label_train, vAR_Predicted, edgecolors=(0, 0, 0))
    vAR_ax.plot([vAR_Label_train.min(), vAR_Label_train.max()], [vAR_Label_train.min(), vAR_Label_train.max()], 'k--', lw=4)
    vAR_ax.set_xlabel('Actual Intercompany Transaction')
    vAR_ax.set_ylabel('Predicted Intercompany Transaction')
    ## plt.show()
    vAR_plt.savefig(vAR_Fetched_Data_Cross_Validation_Image_Example_1)

Cross Validation Model Plotted (Example-1)

Cross Validation Model Plotted

Training Data Set (Example-2)

Training Data Set

Test Data Set (Example-2)

Test Data Set

Cross Validation Model Code Block (Example-2)

if vAR_Fetched_Data_Cross_Validation_Required =='Y':
    #from sklearn import datasets
    from sklearn.model_selection import cross_val_predict
    from sklearn.linear_model import LogisticRegression
    import matplotlib.pyplot as vAR_plt
    vAR_model = LogisticRegression()
    vAR_Predicted = cross_val_predict(vAR_model, vAR_Features_train, vAR_Label_train , cv=5)
    vAR_fig, vAR_ax = vAR_plt.subplots()
    vAR_ax.scatter(vAR_Label_train[:20], vAR_Predicted[:20], edgecolors=(0, 0, 0))
    vAR_ax.plot([vAR_Label_train[:20].min(), vAR_Label_train[:20].max()], [vAR_Label_train[:20].min(), vAR_Label_train[:20].max()], 'k--', lw=4)
    vAR_ax.set_xlabel('Actual Intercompany Transaction')
    vAR_ax.set_ylabel('Predicted Intercompany Transaction')
    ## plt.show()
    vAR_plt.savefig(vAR_Fetched_Data_Cross_Validation_Image_Example_2)

Cross Validation Model Plotted (Example-2)

Cross Validation Model Plotted

Training Data Set (Example-3)

Training Data Set

Test Data Set (Example-3)

Test Data Set
Cross Validation Model Code Block (Example-3)

if vAR_Fetched_Data_Cross_Validation_Required =='Y':
    #from sklearn import datasets
    from sklearn.model_selection import cross_val_predict
    from sklearn.linear_model import LogisticRegression
    import matplotlib.pyplot as vAR_plt
    vAR_model = LogisticRegression()
    vAR_Predicted = cross_val_predict(vAR_model, vAR_Features_train, vAR_Label_train , cv=10)
    vAR_fig, vAR_ax = vAR_plt.subplots()
    vAR_ax.scatter(vAR_Label_train[:15], vAR_Predicted[:15], edgecolors=(0, 0, 0))
    vAR_ax.plot([vAR_Label_train[:15].min(), vAR_Label_train[:15].max()], [vAR_Label_train[:15].min(), vAR_Label_train[:15].max()], 'k--', lw=4)
    vAR_ax.set_xlabel('Actual Intercompany Transaction')
    vAR_ax.set_ylabel('Predicted Intercompany Transaction')
    ##plt.show()
    vAR_plt.savefig(vAR_Fetched_Data_Cross_Validation_Image_Example_3)

Cross Validation Model Plotted (Example-3)

Cross Validation Model Plotted
Hyperparameter Tuning

Hyperparameter optimization, or tuning, is the problem of choosing a set of optimal hyperparameters for a learning algorithm. The same kind of machine learning model can require different constraints, weights, or learning rates to generalize different data patterns. These settings are called hyperparameters and have to be tuned so that the model can optimally solve the machine learning problem. Hyperparameter optimization finds a tuple of hyperparameters that yields an optimal model, one which minimizes a predefined loss function on given independent data.
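The code blocks below tune a LogisticRegression by hand (changing C, fit_intercept, and warm_start). As a hedged sketch of a more systematic alternative, the snippet here uses scikit-learn's GridSearchCV to search similar hyperparameters with cross-validation; it assumes vAR_Features_train and vAR_Label_train are loaded as in the main code block.

from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import LogisticRegression

# Assumes vAR_Features_train and vAR_Label_train are already loaded
vAR_param_grid = {'C': [0.1, 1.0, 10.0], 'fit_intercept': [True, False]}
vAR_search = GridSearchCV(LogisticRegression(), vAR_param_grid, cv=3)
vAR_search.fit(vAR_Features_train, vAR_Label_train)
print("Best hyperparameters:", vAR_search.best_params_)
print("Best cross-validation score:", vAR_search.best_score_)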

Training Data Set


Training Data Set

Test Data Set


Test Data Set

Hyperparameter Tuning Code Block Before Tuning

# Hyperparameter Tuning
if vAR_Fetched_Data_Hyperparameter_Tuning_Required =='Y':
    import matplotlib.pyplot as vAR_plt
    vAR_le = LabelEncoder()
    vAR_Transaction_Type_Conversion = vAR_le.fit_transform(vAR_df.iloc[:,7])
    vAR_Transaction_Type_Conversion_df = vAR_pd.DataFrame(vAR_Transaction_Type_Conversion,columns={'Transaction_Type_Converted'})
    vAR_Data_Category_Conversion = vAR_le.fit_transform(vAR_df.iloc[:,9])
    vAR_Data_Category_Conversion_df = vAR_pd.DataFrame(vAR_Data_Category_Conversion,columns={'Data_Category_Converted'})
    vAR_Features_train = vAR_pd.read_excel(vAR_Fetched_Data_Train_All_Features)
    vAR_Label_train = vAR_df.iloc[:,12]
    vAR_model = LogisticRegression()
    vAR_model.fit(vAR_Features_train,vAR_Label_train)
    vAR_plt.scatter(vAR_Features_train.iloc[:,0],vAR_Label_train,s=100, c='gbc')
    vAR_df3 = vAR_pd.read_excel(vAR_Fetched_Data_Source_Path_Test_Data)
    vAR_Transaction_Type_Conversion_test = vAR_le.fit_transform(vAR_df3.iloc[:,3])
    vAR_Transaction_Type_Conversion_test_df = vAR_pd.DataFrame(vAR_Transaction_Type_Conversion_test,columns={'Transaction_Type_Converted'})
    vAR_Data_Category_Conversion_test = vAR_le.fit_transform(vAR_df.iloc[:,4])
    vAR_Data_Category_Conversion_test_df = vAR_pd.DataFrame(vAR_Data_Category_Conversion_test,columns={'Data_Category_Converted'})
    vAR_df4 = vAR_df3.merge(vAR_Transaction_Type_Conversion_test_df,left_index=True, right_index=True)
    vAR_df5 = vAR_df4.merge(vAR_Data_Category_Conversion_test_df,left_index=True, right_index=True)
    vAR_Features_test = vAR_pd.read_excel(vAR_Fetched_Data_Test_All_Features)
    vAR_Labels_Pred = vAR_model.predict(vAR_Features_test)
    vAR_Labels_Pred = vAR_pd.DataFrame(vAR_Labels_Pred,columns={'Predicted_Inter_Transaction_Type'})
    vAR_df6 = vAR_pd.read_excel(vAR_Fetched_Data_Source_Path_Test_Data)
    vAR_df7 = vAR_df6.merge(vAR_Labels_Pred,left_index=True, right_index=True)
    vAR_df8 = vAR_df7.to_excel(vAR_Fetched_Data_Model_Path, engine='xlsxwriter')
    vAR_df9 = vAR_pd.read_excel(vAR_Fetched_Data_Model_Path)
    vAR_plt.plot(vAR_df9.iloc[:,0],vAR_df9.iloc[:,5],c='b')
    vAR_plt.xlabel('Company')
    vAR_plt.ylabel('Predicted Intercompany Transaction')
    #vAR_plt.show()
    vAR_plt.savefig(vAR_Fetched_Data_Before_Hyperparameter_Tuning_Image)

Hyperparameter Tuning Plotted Before Tuning

Hyperparameter Tuning Plotted Before Tuning

Training Data Set


Training Data Set

Test Data Set


Test Data Set

Hyperparameter Tuning Code Block After Tuning

if vAR_Fetched_Data_Hyperparameter_Tuning_Required =='Y':
    import matplotlib.pyplot as vAR_plt
    vAR_le = LabelEncoder()
    vAR_Transaction_Type_Conversion = vAR_le.fit_transform(vAR_df.iloc[:,7])
    vAR_Transaction_Type_Conversion_df = vAR_pd.DataFrame(vAR_Transaction_Type_Conversion,columns={'Transaction_Type_Converted'})
    vAR_Data_Category_Conversion = vAR_le.fit_transform(vAR_df.iloc[:,9])
    vAR_Data_Category_Conversion_df = vAR_pd.DataFrame(vAR_Data_Category_Conversion,columns={'Data_Category_Converted'})
    vAR_Features_train = vAR_pd.read_excel(vAR_Fetched_Data_Train_All_Features)
    vAR_Label_train = vAR_df.iloc[:,12]
    vAR_model = LogisticRegression(C=10.0, fit_intercept=False, warm_start=True)
    vAR_model.fit(vAR_Features_train,vAR_Label_train)
    vAR_plt.scatter(vAR_Features_train.iloc[:,0],vAR_Label_train,s=100, c='gbc')
    vAR_df3 = vAR_pd.read_excel(vAR_Fetched_Data_Source_Path_Test_Data)
    vAR_Transaction_Type_Conversion_test = vAR_le.fit_transform(vAR_df3.iloc[:,3])
    vAR_Transaction_Type_Conversion_test_df = vAR_pd.DataFrame(vAR_Transaction_Type_Conversion_test,columns={'Transaction_Type_Converted'})
    vAR_Data_Category_Conversion_test = vAR_le.fit_transform(vAR_df.iloc[:,4])
    vAR_Data_Category_Conversion_test_df = vAR_pd.DataFrame(vAR_Data_Category_Conversion_test,columns={'Data_Category_Converted'})
    vAR_df4 = vAR_df3.merge(vAR_Transaction_Type_Conversion_test_df,left_index=True, right_index=True)
    vAR_df5 = vAR_df4.merge(vAR_Data_Category_Conversion_test_df,left_index=True, right_index=True)
    vAR_Features_test = vAR_pd.read_excel(vAR_Fetched_Data_Test_All_Features)
    vAR_Labels_Pred = vAR_model.predict(vAR_Features_test)
    vAR_Labels_Pred = vAR_pd.DataFrame(vAR_Labels_Pred,columns={'Predicted_Inter_Transaction_Type'})
    vAR_df6 = vAR_pd.read_excel(vAR_Fetched_Data_Source_Path_Test_Data)
    vAR_df7 = vAR_df6.merge(vAR_Labels_Pred,left_index=True, right_index=True)
    vAR_df8 = vAR_df7.to_excel(vAR_Fetched_Data_Model_Path, engine='xlsxwriter')
    vAR_df9 = vAR_pd.read_excel(vAR_Fetched_Data_Model_Path)
    vAR_plt.plot(vAR_df9.iloc[:,0],vAR_df9.iloc[:,5],c='b')
    vAR_plt.xlabel('Company')
    vAR_plt.ylabel('Predicted Intercompany Transaction')
    #vAR_plt.show()
    vAR_plt.savefig(vAR_Fetched_Data_After_Hyperparameter_Tuning_Image)

Hyperparameter Tuning Plotted After Tuning

Hyperparameter Tuning Plotted After Tuning

Content Developer

Our team comprises MIT facilitators, Harvard PhDs, Stanford alumni, leading management consulting experts, industry leaders and proven entrepreneurs. Collectively, our team brings business and technology together with risk-free implementation of artificial intelligence for the enterprise.
Customers’ Vocal Endorsements
We have been delivering impactful products and services in artificial intelligence, data engineering, finance, analytics, training and talent development for every business function. We work closely with senior executives as well as technical developers.


Contact

Point of Contact

Jothi Periasamy
Chief AI Architect


Address

2100 Geng Road
Suite 210
Palo Alto
CA 94303


Contact e-Mail

Info@DeepSphere.AI


Contact Phone

(916)-296-0228


Web

https://www.deepsphere.ai
