There are multiple tools to get started with R. I have explored Anaconda, as it gives me the flexibility of using Python alongside R in the same environment. Within Anaconda you can either install RStudio or use a Jupyter notebook.
Once you have installed Anaconda, go to the command prompt and create a new environment:
conda env create -f requirements/my-environment.yml
After that, activate the environment:
source activate my-environment
OR
Create a notebook:
jupyter notebook test_R_notebook.ipynb
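If the R kernel does not show up as an option inside Jupyter, one way to register it (assuming the IRkernel package is installed in the environment, for example via the conda package r-irkernel) is from an R session:
# register the R kernel with Jupyter so notebooks can use it
IRkernel::installspec()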
Once you have R up and running, either in RStudio or a Jupyter notebook, here are a few basic commands to get started.
# Read file
mydata<-read.csv("path/filename.csv")
or
mydata<-read.csv("path/filename.csv", header=TRUE)
# Print data
mydata
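A quick structural check saves time later; str() and head() are base R:
# inspect column types and preview the first rows
str(mydata)
head(mydata)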
# convert a text (categorical) column to numeric codes
employeeDataNum$department <- as.numeric(as.factor(employeeDataNum$department))
# remove NA values
mydata<-na.omit(mydata)
mydata
# Replace NA with average
mydata$column[is.na(mydata$column)] <- round(mean(mydata$column, na.rm = TRUE))
# Plot bars for items
data_subset<-mydata[c(7,8:20)]
data_subset<-ifelse(data_subset=='yes', 1,0)
barplot(data_subset)
# plot a boxplot
boxplot(mydata$column)
Check your library paths
Sys.getenv("R_LIBS_USER")
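You can also list every library location R searches, in order of precedence:
# all library paths R will look in
.libPaths()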
# Install a package
install.packages("AER")
# or install with dependencies
install.packages("AER", dependencies=TRUE)
# load a library
library(dplyr)
# Getting specific columns
datanew<-mydata[,c(7,8,9,10)]
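Since dplyr is already loaded above, you can also pick columns by name instead of by position; the column names below are only placeholders for whatever exists in your data:
# select columns by name (hypothetical column names)
datanew <- select(mydata, satisfaction_level, last_evaluation, department)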
Divide the data set into training and test sets (a rough 70/30 random split):
set.seed(4)
inTraining <- sample(2, nrow(mydata), prob=c(0.7,0.3), replace=TRUE)
trainset <- mydata[inTraining==1,]
testset <- mydata[inTraining==2,]
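It is worth checking that the split actually came out close to 70/30:
# rows per partition and their proportions
table(inTraining)
prop.table(table(inTraining))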
# Applying a linear model to the training data
linearmodel <- lm(Other_players ~ ., data = trainset)
linearmodel
# Predict on the test data
predictions <- predict(linearmodel, testset)
# plot actual vs. predicted values for the first 100 test rows
testsubset <- testset[1:100,]
plot(testsubset$Other_players, type="l")
lines(predictions[1:100], col="red")
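To put a number on the fit rather than eyeballing the plot, compute the root mean squared error on the test set; a minimal sketch using only base R:
# root mean squared error of the predictions on the test set
rmse <- sqrt(mean((testset$Other_players - predictions)^2, na.rm = TRUE))
rmse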
# Finding correlation among columns
correlation <- cor(mydata)
install.packages('corrplot', dependencies=TRUE)
library(corrplot)
corrplot(correlation,type='lower')
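Note that cor() expects numeric columns, so if your data frame mixes text and numbers, subset to the numeric columns first:
# keep only numeric columns before computing correlations
numeric_cols <- sapply(mydata, is.numeric)
correlation <- cor(mydata[, numeric_cols])
corrplot(correlation, type='lower')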
# Subsetting data based on some condition
employee_left<-subset(employeeData, left==1)
employee_left
# More plotting
plot(employeeData$salary)
hist(employeeData$last_evaluation)
# Summary
summary(employeeData)
# creating a decision tree (using 'left' as the target, matching the test data below)
library(rpart)
my_tree <- rpart(formula = left ~ ., data = traindata, method = "class")
plot(my_tree, margin = 0.1)
text(my_tree, pretty = TRUE, cex = 0.7)
# Confusion matrix
predtree<-predict(my_tree,testdata,type="class")
install.packages('e1071', dependencies=TRUE)
library(caret)
confusionMatrix(table(predtree, testdata$left))
# using random forest for analysis
library(randomForest)
employee_forest<-randomForest(left~.,data=traindata)
predforest<-predict(employee_forest,testdata,type="class")
confusionMatrix(table(predforest,testdata$left))
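A nice side benefit of randomForest is its variable importance measures, which show which columns drive the prediction:
# variable importance of the fitted forest
importance(employee_forest)
varImpPlot(employee_forest)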
# using naive bayes
library(e1071)
employee_naive<-naiveBayes(left~.,data=traindata)
pred_naive<-predict(employee_naive,testdata,type="class")
confusionMatrix(table(pred_naive,testdata$left))
# using svm
employee_svm<-svm(left~.,data=traindata)
pred_svm<-predict(employee_svm,testdata,type="class")
confusionMatrix(table(pred_svm,testdata$left))
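confusionMatrix() returns an object whose overall element contains the accuracy, so you can store each result and compare the models; a small sketch assuming the predictions above have all been computed:
# compare overall accuracy of the four models
cm_tree   <- confusionMatrix(table(predtree,   testdata$left))
cm_forest <- confusionMatrix(table(predforest, testdata$left))
cm_naive  <- confusionMatrix(table(pred_naive, testdata$left))
cm_svm    <- confusionMatrix(table(pred_svm,   testdata$left))
c(tree   = cm_tree$overall["Accuracy"],
  forest = cm_forest$overall["Accuracy"],
  naive  = cm_naive$overall["Accuracy"],
  svm    = cm_svm$overall["Accuracy"])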