Competency 3.2: Perform social network analysis and visualize analysis results in Gephi.

Here are a few visualizations of the modularity, closeness, and betweenness measures in Gephi.

[Figures: Modularity Class; Betweenness Centrality; Closeness and Betweenness Centrality]


Competency 3.1: Define social network analysis and its main analysis methods.

Social network analysis is the mapping of connections between nodes or actors (people), communities (organizations), and so on, based on a few measures.

Density is one such measure: the ratio of actual connections to potential connections.
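As a rough illustration of that ratio, here is a minimal sketch in Python. The function name `density` and the example numbers are my own assumptions for illustration and are not taken from Gephi.

```python
def density(num_nodes, num_edges, directed=False):
    """Ratio of actual connections to potential connections."""
    if num_nodes < 2:
        return 0.0
    # Every ordered pair of distinct nodes is a potential directed connection.
    potential = num_nodes * (num_nodes - 1)
    if not directed:
        # In an undirected network, a-b and b-a are the same connection.
        potential //= 2
    return num_edges / potential


# Hypothetical class forum: 5 learners and 4 connections between them.
print(density(5, 4))                 # undirected: 4 / 10
print(density(5, 4, directed=True))  # directed:   4 / 20
```

A density close to 1 means that almost everyone is connected to everyone else, while a value close to 0 means a sparse network.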

There are also centrality measures such as in-degree and out-degree, betweenness (nodes that act as brokers between two different communities), and closeness (nodes that have the shortest paths to the other nodes in the community).
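To make two of these measures concrete, here is a small sketch in plain Python. The reply network, the function names, and the choice to ignore edge direction for closeness are all assumptions made for illustration, so the numbers Gephi reports for a real graph may be scaled differently. Betweenness is left out because it needs a longer algorithm.

```python
from collections import deque

# Hypothetical directed reply network: (a, b) means a replied to b.
edges = [("ann", "bob"), ("bob", "cat"), ("cat", "ann"), ("dan", "ann")]


def degrees(edges):
    """Return (in_degree, out_degree) dictionaries for a directed edge list."""
    in_deg, out_deg = {}, {}
    for source, target in edges:
        for node in (source, target):
            in_deg.setdefault(node, 0)
            out_deg.setdefault(node, 0)
        out_deg[source] += 1
        in_deg[target] += 1
    return in_deg, out_deg


def closeness(edges, start):
    """Reachable nodes divided by the sum of shortest-path lengths from start.

    Edge direction is ignored here (an assumption made to keep the sketch short).
    """
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    # Breadth-first search gives shortest path lengths in an unweighted graph.
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


in_deg, out_deg = degrees(edges)
print(in_deg["ann"], out_deg["ann"])  # 2 1
print(closeness(edges, "ann"))        # 1.0
print(closeness(edges, "dan"))        # 0.6
```

In this toy network "ann" is one step away from everyone, so her closeness is the highest possible, while "dan" has to go through "ann" to reach the others.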

Modularity is another measure, which helps to group nodes into communities according to the resolution setting provided in Gephi. A resolution below 1 produces more communities, and a resolution above 1 produces fewer communities.

Competency 2.3: Evaluate the impact of policy and strategic planning on systems-level deployment of learning analytics

Learning analytics helps with real-time assessment, which in turn is focused on improving individuals' learning skills. This cannot be achieved without good leadership and proper strategic plans. Planning a systems-level deployment of learning analytics depends on the size of the data sets. We need the required data to carry out analytics, whether on small or big data, and we need to know what role the data plays.

Competency 2.2: Download, install, and conduct basic analytics using Tableau software.

I downloaded and installed the Tableau software. After watching the start-up videos on Tableau, I worked on the dataset “InstructorFeedback.csv”. I cleaned up everything unnecessary (for example, removed blank rows and columns) and then saved the file. I then started Tableau and connected to the saved data.
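I did this clean-up by hand, but the same kind of step can be sketched in Python with the standard csv module. The sample text and its column names below are made up for illustration and are not the real contents of “InstructorFeedback.csv”.

```python
import csv
import io


def clean_rows(rows):
    """Drop fully blank rows, then drop columns that are empty in every row."""
    rows = [row for row in rows if any(cell.strip() for cell in row)]
    if not rows:
        return []
    # Pad short rows so that every row has the same number of cells.
    width = max(len(row) for row in rows)
    rows = [row + [""] * (width - len(row)) for row in rows]
    keep = [i for i in range(width) if any(row[i].strip() for row in rows)]
    return [[row[i] for i in keep] for row in rows]


# Hypothetical sample with one empty column and one blank row.
raw = "course,,score\nSTAT101,,4.5\n,,\nMATH200,,3.9\n"
cleaned = clean_rows(list(csv.reader(io.StringIO(raw))))
for row in cleaned:
    print(row)
```

For a real file, the rows would come from `csv.reader(open(path, newline=""))` instead of the in-memory sample.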

Then I decided on a few questions and used Tableau to answer them.

Number of students enrolled in each course per semester, across all years?

Number of students enrolled in each course per semester?

Number of responses for each course per semester?

Are the instructor mean eval score and the course mean eval score equal or not?

Please find the screenshot below:


Competency 2.1: Describe the learning analytics data cycle.

Step 1: Collection & Acquisition

This is the first task in the cycle, and it depends on the questions you want the data to answer. There are numerous data sources (LMS, Excel spreadsheets, text files, etc.) from which data is collected.
Step 2: Storage

The second step is deciding how you will store the data that you have collected. Usually the data is stored in the tool where you do the analysis work.
Step 3: Cleaning

Data cleaning is the process of transforming the data into a format in which analysis can take place. It includes removing blank rows, renaming columns so that they are uniquely identifiable, defining the formats of the data, etc.
Step 4: Integration

Once the data is cleaned, we link the diverse datasets being used. This link can be made by identifying unique records in the datasets that relate to each other.
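A minimal sketch of such a link in Python is shown here. The two datasets, the `student_id` key, and the function name `link_on_key` are hypothetical and only illustrate the idea of matching records on a shared unique identifier.

```python
# Hypothetical records from two sources that share a student_id column.
lms_activity = [
    {"student_id": "s1", "forum_posts": 12},
    {"student_id": "s2", "forum_posts": 3},
]
grades = [
    {"student_id": "s1", "final_grade": 88},
    {"student_id": "s3", "final_grade": 71},
]


def link_on_key(left, right, key):
    """Inner join: keep only records whose key value appears in both datasets."""
    right_by_key = {record[key]: record for record in right}
    linked = []
    for record in left:
        match = right_by_key.get(record[key])
        if match is not None:
            # Merge the two records into one combined record.
            linked.append({**record, **match})
    return linked


linked = link_on_key(lms_activity, grades, "student_id")
print(linked)
```

Only "s1" appears in both sources, so it is the only record in the linked result. Whether unmatched records should be dropped or kept depends on the question being asked.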
Step 5: Analysis

Analyzing the data means using the data compiled in Step 1 through Step 4 to answer your questions with the tools available.
Step 6: Representation and Visualization

Analyzed data can be visualized with many tools (such as Tableau). This helps clients/customers to understand the trends easily.

Step 7: Action

Based on the trends, one can take the necessary steps to help improve the standards of the respective courses.

Competency 1.2: Define learning analytics and detail types of insight they can provide to educators and learners.

Learning analytics is the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimising learning and the environments in which it occurs.

Insight to Educators:

  • Assume all students are at risk at all times, to reduce attrition.
  • Assume students present different levels of risk at different times, to provide help when it is needed.
  • Leverage observations from data to support high-quality learning design and an improved curriculum process.
  • Provide learners with a few recommendations on their performance.

Insight to Learner:

  • Keep an eye on progress through the course without missing deadlines.
  • Evaluate results topic by topic, and concentrate more on the topics in which he/she scored lower, as per the recommendations from the instructor.
  • Gauge the difficulty level beforehand and fulfill the prerequisites needed to master the difficulties.

Competency 1.1: Identify proprietary and open source tools commonly used in learning analytics.

Tool (URL) | Description | Opportunities in Learning Analytic Solutions | Weaknesses/Concerns/Comments
Data Cleansing/Integration

Prior to conducting data analysis and presenting it through visualizations, data must be acquired (extracted), integrated, cleansed and stored in an appropriate data structure. We will look at this in more detail in Week 2 of #DALMOOC. Given the need for both structured and unstructured data, the ideal tools will be able to access and load data to and from data sources including RSS feeds, API calls, RDBMS and unstructured data stores such as Hadoop.
Pentaho Data Integration
Description: Pentaho Data Integration (PDI) is a powerful, easy-to-learn open source ETL tool that supports acquiring data from a variety of data sources including flat files, relational databases, Hadoop databases, RSS feeds, and RESTful API calls. It can also be used to cleanse and output data to the same list of data sources.
Opportunities: PDI provides a versatile ETL tool that can grow with the evolution of an institution's learning analytics program. For example, initially an LA program may start with institutional data that is easily accessible via institutional relational databases. As the program grows to include text mining and recommendation systems that require extracting unstructured data outside the institution, the skills developed with PDI will accommodate the new sources of data collection and cleansing.
Weaknesses/Concerns: Two concerns with PDI:
1. Pentaho does not have built-in integration with R statistics. Instead, Pentaho data mining integration focuses on a WEKA module.
2. Pentaho is moving away from the open source model. Originally PDI was an open source ETL tool called Kettle, developed by Matt Casters. Since Pentaho acquired Kettle (and Matt Casters), it has become a central piece of their subscription-based BI Suite and the support costs are growing at a rapid pace.
Lavastorm Analytics
Description (copied from the Lavastorm website): Lavastorm Analytics is the agile data management and analytics company trusted by enterprises seeking an analytic advantage. Our data discovery platform empowers business professionals and analysts with the fastest, most accurate way to discover and transform insights into business improvements, while providing IT with control over data governance. Gain greater control for analyzing data against complex business logic and for manipulating data from Excel, CSV or ASCII files.

Faster. Better. Cheaper. Most organizations are turning to analytics to achieve these business goals. And increasingly, they are demanding that their analytics infrastructure and people can manage more data, respond with more speed, adapt to more change, and support more decision makers. Agility and flexibility are no longer choices – they are requirements. The Lavastorm Analytics Platform and its Lavastorm Analytics Engine help you break through these limitations with a new, agile way to analyze, optimize, and control your data and processes with BI reports, spreadsheets, databases, visualization tools, and traditional audit software.
Tabula
Description: Tabula is free and available under the MIT open-source license. Tabula lets you upload a (text-based) PDF file into a simple web interface and magically pull tabular data into CSV format.
Opportunities: The Tabula website has a clear explanation of how to use the tool to extract tabular data from PDF documents.
Weaknesses/Concerns: Limitations copied from the Tabula website, correct as of 27.10.14:

  • Scanned PDFs: Tabula works on text-based PDFs only, so you’re still stuck with manual labor if you have scanned PDFs. Free OCR technology is not quite to the point where we’d trust automating it with many pages of data. For those files, Raleigh Public Record’s DocHive is worth a look.
  • Multi-line rows: PDFs with multi-line rows (word-wrapped text) are often mis-detected, particularly in tables without graphic row separators.
  • Automatic table detection: We’re working on automating the detection of tables. For now, you’ll have to do a manual rectangular selection around the candidate tables.
  • Complex tables: Tables can get complex. Deriving a functional representation (data schema) from tables is no easy task. Tabula works best with tables that don’t contain rows or columns spanning several cells.

[Figure: a complex table. Tabula can correctly deal with the portion outlined in red.]


Statistical Modeling

There are three major statistical software vendors: SAS, SPSS (IBM) and R. All three of these tools are excellent for developing analytic/predictive models that are useful in developing learning analytics models. This section focuses on R. The open source project R has numerous packages and commercial add-ons available that position it well to grow with any LA program. R is commonly used in many data/analytics MOOCs to help learners work with data. We opted for Tableau during weeks 1 & 2 due to its ease of use and relatively short learning curve.
R
Description: R is an active open source project that has numerous packages available to perform any type of statistical modeling.
Opportunities: R's strength is the fact that it is widely used by the research community. Code for analysis is widely available and there are many packages available to help with any type of analysis and presentation that might be of interest. Some of these include:

  • Visualization:
    1. ggplot provides good charting functionality.
    2. googlevis provides an interface between R and the Google Visualization API
  • Text Mining:
    1. tm provides functions for manipulating text including stripping whitespace and stop words and removing suffixes (stemming).
    2. openNLP identifies words as nouns, verbs, adjectives or adverbs
Weaknesses/Concerns: Two issues that may be of concern to some universities:

  • Lack of support: only Revolution R provides support for the R product.
  • High level of expertise required to develop and maintain R: how does a university retain people who have the skills required to develop and maintain R/RevoDeployR?
SAS
Description: SAS (Statistical Analysis System) is a software suite developed by SAS Institute for advanced analytics, business intelligence, data management, and predictive analytics.
Opportunities: It is the largest market-share holder for advanced analytics, and there are plenty of learning materials on SAS. SAS also offers various certification options, as mentioned on their website.
Weaknesses/Concerns: The SAS suite is highly expensive compared to open source tools like R, so small and medium-size companies prefer R to SAS. Also, R is considered superior to SAS in terms of data visualization.
Network Analysis

Network analysis focuses on the relationship between entities. Whether the entities are students, researchers, learning objects or ideas, network analysis attempts to understand how the entities are connected rather than understand the attributes of the entities. Measures include density, centrality, connectivity, betweenness and degrees. We will spend time in #DALMOOC on these topics in weeks 3 & 4.
SNAPP
Description: Social Networks Adapting Pedagogical Practice (SNAPP) is a network visualization tool that is delivered as a ‘bookmarklet’. Users can easily create network visualizations from LMS (Blackboard, Moodle, and D2L) forums in real time.
Opportunities:

  • Self-assessment tool for students: SNAPP provides students with easy access to network visualizations of forum postings. These diagrams can help students understand their contribution to class discussions.
  • Identify at-risk students / monitor impact of learning activity: Network analysis visualizations can help faculty identify students that may be isolated. They can also be used to see if specific activities have impacted the class network.



Gephi
Description: Gephi is the ideal platform for leading innovation about dynamic network analysis (DNA). Dynamic structures, such as social networks, can be filtered with the timeline component. It can import temporal graphs with the GEXF file format and is graph-streaming ready.
Weaknesses/Concerns: To produce visualizations of dynamic, complex systems and hierarchical graphs, Gephi needs data arranged into two files in the GEXF format. One file must contain information about the learners in vertices or nodes. The other file contains information about the interactions between the participants, called the edges, arranged by the communication source, target and type. The provision of files organized in a different way from other LMS tools is a common data organization requirement for SNA tools.
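
The node/edge split described above can be sketched in Python. Note the assumptions: this sketch writes two CSV tables (which Gephi's spreadsheet import accepts) rather than GEXF, the reply data is invented, and the function names are my own. The column headers Id, Label, Source, Target and Type follow the names Gephi expects for node and edge tables.

```python
import csv
import io

# Hypothetical forum replies: (author, replied_to).
replies = [("ann", "bob"), ("bob", "ann"), ("cat", "ann")]


def build_tables(replies):
    """Split an interaction list into a node table and an edge table."""
    nodes = sorted({person for pair in replies for person in pair})
    node_rows = [["Id", "Label"]] + [[n, n] for n in nodes]
    edge_rows = [["Source", "Target", "Type"]]
    edge_rows += [[source, target, "Directed"] for source, target in replies]
    return node_rows, edge_rows


def to_csv(rows):
    """Render rows as CSV text (kept in memory so the sketch has no side effects)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


node_rows, edge_rows = build_tables(replies)
print(to_csv(node_rows))
print(to_csv(edge_rows))
```

Saving the two outputs as separate files gives one table for the learners and one for their interactions, which is the organization that most LMS exports do not provide directly.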

CourseVis
Description: The CourseVis tool used trace data to visualize learners’ social, cognitive, and behavioral data, then presented the relationships between the three en masse to educators.
Opportunities: The ability of visualizations to help individuals form mental models, and thus a better understanding of the data presented, is grounded in information visualization theory. CourseVis uses scatterplots and matrices along with color, proximal placement, rotation and perspective projection (Tufte, 1990) to present multidimensional data for exploration (Mazza & Dimitrova, 2004).
Weaknesses/Concerns: Learners couldn’t benefit from the same information.
Knowledge Building Discourse
Description: SNA visualization to examine how learners ... built into ... social learning.
Weaknesses/Concerns: Visualizations may be difficult for novices.
Linked Data

If Tim Berners-Lee's vision of linked data is successful in transforming the internet into a huge database, the value of delivering content via courses and programs will diminish and universities will need to find new ways of adding value to learning. Developing tools that can facilitate access to relevant content using linked data could be one way that universities remain relevant in the higher learning sector.
Ontologies (e.g. DBpedia)
Description: Ontologies are essentially an agreed-upon concept map for a particular domain of knowledge.
Opportunities: Dynamically deliver relevant content.
Visualization

The presentation of the data after it has been extracted, cleansed and analyzed is critical to successfully engage students in learning and acting on the information that is presented.
Google Visualization APIs
Description: Google Visualization provides an API to their chart library, allowing for the creation of charts and other visualizations. They have recently released an API to add interactive controls to their charts.
Opportunities: Interactive learning dashboards. All of these tools are useful for creating visualizations for learning feedback systems such as dashboards. All of these tools can present data as heat maps, network analysis diagrams and tree maps. Here’s a link to an example dashboard created in D3, presenting university admission data.
Weaknesses/Concerns: Learning how to use these tools/libraries requires a fair amount of effort. Developer retention is a risk for system maintenance and enhancement.
Protovis and D3
Description: Protovis and D3 are JavaScript frameworks for creating web-based visualizations. Protovis is no longer an active open source project. It has been replaced by D3.
Open Heat Map
Description: Open Heat Map allows the user to upload a spreadsheet and create an interactive online map automatically.
Opportunities: Interactive learning dashboards; distribution of learning data over time.
Chart.js
Description: Chart.js allows you to create responsive and interactive charts. There are 6 chart types and it uses the HTML5 canvas element. Open source.
Opportunities: Interactive learning dashboards; distribution of learning data over time.
Weaknesses/Concerns: Coding knowledge needed. Learning how to use these tools/libraries requires a fair amount of effort, and developer retention is a risk for system maintenance and enhancement.