Closing the Loop on Python Circular Import Issue

Introduction

Python’s versatility and ease of use have made it a popular choice among developers for a wide range of applications. However, as projects grow in complexity, so do the challenges that developers face. One such challenge is the notorious “Python circular import issue.” In this article, we will explore the intricacies of circular imports, the problems they can pose, and the strategies to effectively address and prevent them, enabling you to write cleaner and more maintainable Python code. Whether you’re a seasoned Python developer or just starting, understanding and resolving circular imports is a crucial skill in ensuring the robustness and scalability of your projects.

What is the Python circular import issue?

In Python, the circular import issue arises when two or more modules depend on each other in a way that creates a loop of dependencies. Imagine Module A needing something from Module B, and Module B needing something from Module A, leading to a tangled web of imports. This situation can result in a perplexing challenge for Python interpreters, often manifesting as an ImportError. Let’s illustrate this with a simple example:

# module_a.py
import module_b

def function_a():
    return "This is function A in Module A"

print(function_a())
print(module_b.function_b())

# module_b.py
import module_a

def function_b():
    return "This is function B in Module B"

print(function_b())
print(module_a.function_a())

In this example, module_a.py imports module_b.py, and vice versa. When you run module_a.py, the circular import triggers an error – typically an AttributeError about a partially initialized module with plain import statements, or an ImportError when from-style imports are used. Either way, the circular dependency causes confusion and prevents the code from running as intended.

Understanding circular dependencies and their causes

Circular dependencies often result from poor code organization or a lack of modularization in your Python project. They can be unintentional and tend to emerge as your codebase grows in complexity. Let’s explore some common scenarios that lead to circular dependencies and their underlying causes:

Importing Modules That Depend on Each Other Directly or Indirectly

Circular dependencies often stem from situations where modules directly or indirectly depend on each other. Here’s a different example to illustrate this scenario:

# employee.py
from department import Department

class Employee:
    def __init__(self, name):
        self.name = name
        self.department = Department("HR")

    def display_info(self):
        return f"Name: {self.name}, Department: {self.department.name}"

# main.py
from employee import Employee

employee = Employee("Alice")
print(employee.display_info())

# department.py
from employee import Employee

class Department:
    def __init__(self, name):
        self.name = name
        self.manager = Employee("Bob")

    def display_info(self):
        return f"Department: {self.name}, Manager: {self.manager.name}"

In this example, the employee.py module imports the Department class from department.py, and the department.py module imports the Employee class from employee.py. This creates a circular dependency where each module relies on the other, potentially leading to a circular import issue when running the code.

Understanding and recognizing such dependencies in your code is crucial for addressing circular import issues effectively.

Circular References in Class Attributes or Function Calls

Circular dependencies can also arise when classes or functions from one module reference entities from another module, creating a loop of dependencies. Here’s an example:

# module_p.py
from module_q import ClassQ

class ClassP:
    def __init__(self):
        self.q_instance = ClassQ()

    def method_p(self):
        return "This is method P in Class P"

print(ClassP().method_p())

# module_q.py
from module_p import ClassP

class ClassQ:
    def __init__(self):
        self.p_instance = ClassP()

    def method_q(self):
        return "This is method Q in Class Q"

print(ClassQ().method_q())

In this case, ClassP from module_p.py references ClassQ from module_q.py, and vice versa, creating a circular dependency.

A Lack of Clear Boundaries Between Modules

When your project lacks well-defined module boundaries, it becomes easier for circular dependencies to sneak in. Without a clear separation of concerns, modules may inadvertently rely on each other in a circular manner.

Understanding these common causes of circular dependencies is essential for effectively addressing and preventing them in your Python projects. In the following sections, we will explore various strategies to mitigate and resolve circular imports.

Issues with circular dependencies

Circular dependencies in Python code can introduce a multitude of problems that hinder code readability, maintainability, and overall performance. Here are some of the key issues associated with circular dependencies:

  • Readability and Maintenance Challenges: Circular dependencies make your codebase more complex and difficult to understand. As the number of intertwined modules increases, it becomes increasingly challenging to grasp the flow of your program. This can lead to confusion for developers working on the project, making it harder to maintain and update the codebase.
  • Testing and Debugging Complexity: Debugging circular dependencies can be a daunting task. When an issue arises, tracing the source of the problem and identifying which module introduced the circular import can be time-consuming and error-prone. This complexity can significantly slow down the debugging process and increase the likelihood of introducing new bugs while attempting to fix the existing ones.
  • Performance Overhead: Circular imports can lead to performance overhead. Python has to repeatedly load and interpret the same modules, which can result in slower startup times for your application. While this may not be a significant concern for smaller projects, it can become a performance bottleneck in larger and more complex applications.
  • Portability Concerns: Circular dependencies can also impact the portability of your code. If your project relies heavily on circular imports, it may become more challenging to reuse or share code across different projects or environments. This can limit the flexibility of your codebase and hinder collaboration with other developers.
  • Code Smells and Design Issues: Circular dependencies are often a symptom of poor code organization and design. They can indicate that modules are tightly coupled, violating the principles of modularity and separation of concerns. Addressing circular dependencies often involves refactoring your code to adhere to better design practices, which can be time-consuming and require a significant effort.

How to fix circular dependencies?

When you encounter circular import issues in your Python code, it’s essential to address them effectively to maintain code clarity and reliability. In this section, we’ll explore various strategies to resolve circular dependencies, ranging from restructuring your code to preventing them in the first place. Let’s dive into each approach:

Import When Needed

One straightforward approach to tackling circular dependencies is to import a module only when it’s needed within a function or method. By doing this, you can reduce the likelihood of circular dependencies occurring at the module level. Here’s an example:

# module_a.py
def function_a():
    return "This is function A in Module A"

# module_b.py
def function_b():
    from module_a import function_a  # Import only when needed
    return f"This is function B in Module B, calling: {function_a()}"

# main.py
from module_b import function_b

print(function_b())

In this example, function_b imports function_a only when it’s called. This approach can help break the circular dependency.

Import the Whole Module

Another strategy is to import the entire module rather than specific attributes or functions. This can help avoid circular imports because you’re not referencing specific elements directly. Consider this approach:

# module_a.py
def function_a():
    return "This is function A in Module A"

# module_b.py
import module_a  # Import the whole module

def function_b():
    return f"This is function B in Module B, calling: {module_a.function_a()}"

# main.py
from module_b import function_b

print(function_b())

Here, module_b imports module_a as a whole, and then function_b can access function_a without causing circular dependencies.

Merge Modules

In some cases, modules that are tightly coupled can be merged into a single module. This consolidation can eliminate circular dependencies by containing everything within a single module. Here’s an example of merging modules:

# merged_module.py
def function_a():
    return "This is function A in the merged module"

def function_b():
    return f"This is function B in the merged module, calling: {function_a()}"

# main.py
from merged_module import function_b

print(function_b())

In this scenario, both function_a and function_b are defined in the same module, eliminating the possibility of circular imports.

Change the Name of the Python Script

Renaming the Python script can sometimes break circular imports. By altering the import path, you can resolve circular dependency issues. Here’s an example:

# module_alpha_renamed.py (renamed from module_alpha.py)
import module_beta

def function_alpha():
    return "This is function Alpha in Module Alpha"

print(function_alpha())
print(module_beta.function_beta())

# module_beta.py
import module_alpha_renamed  # Renamed the script

def function_beta():
    return "This is function Beta in Module Beta"

print(function_beta())
print(module_alpha_renamed.function_alpha())

In this example, module_alpha.py is renamed to module_alpha_renamed.py, and module_beta.py is updated to import the new name. Note that renaming mainly helps when the circularity is accidental – for instance, when your script shadows another module of the same name – and it usually needs to be combined with the restructuring techniques above to break a genuine two-way dependency. Together, these strategies offer practical solutions to address and prevent circular dependencies.

How to avoid circular imports in Python?

Preventing circular imports is often more effective than trying to fix them after they occur. Python offers several techniques and best practices to help you avoid circular imports in your codebase. Let’s explore each of these strategies:

Use Relative Imports (“from . import …”)

You can use relative imports – the from . import module_name form – to state explicitly that you want to import from the current package. This helps you avoid importing the same module under two different names (for example, once as a top-level module and once as a package member), which is a common source of confusing circular-import behavior. Here’s an example:

# package/module_a.py
from . import module_b

def function_a():
    return "This is function A in Module A"

# package/module_b.py
from . import module_a

def function_b():
    return "This is function B in Module B"

# main.py
from package.module_a import function_a
from package.module_b import function_b

print(function_a())
print(function_b())

By using relative imports (from . import …), you ensure that modules within the same package reference each other without causing circular dependencies.

Use Local Imports

Whenever possible, use local imports within functions or methods instead of global imports at the module level. This limits the scope of the import and reduces the risk of circular dependencies. Here’s an example:

# module_c.py
def function_c():
    from module_d import function_d  # Local import
    return f"This is function C in Module C, calling: {function_d()}"

# module_d.py
def function_d():
    return "This is function D in Module D"

# main.py
from module_c import function_c

print(function_c())

In this scenario, function_c locally imports function_d only when needed, avoiding global circular imports.

Use Python’s importlib or __import__() Functions

Python’s importlib module provides fine-grained control over imports, allowing you to dynamically load modules when needed. Similarly, the __import__() function can be used to achieve dynamic imports. These approaches enable you to import modules dynamically and avoid circular dependencies.
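For instance, here is a minimal sketch of deferring an import with importlib.import_module; the module name plugin_module is purely hypothetical and stands in for whichever module would otherwise participate in the cycle:

import importlib

def load_plugin():
    # The module is resolved only when this function runs,
    # not when the current module is first imported
    return importlib.import_module("plugin_module")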

Use Lazy Imports

Lazy loading involves importing modules only when they are needed. Libraries like importlib and importlib.util provide functions to perform lazy imports, which can help mitigate circular import issues. Lazy loading is especially useful for improving the startup time of your application.
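As a rough sketch, importlib.util can be combined with LazyLoader so that a module’s code only runs on first attribute access; this follows the lazy-import recipe from the standard library documentation:

import importlib.util
import sys

def lazy_import(name):
    # Build the module object now, but defer executing its code
    # until an attribute is first accessed
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module

stats = lazy_import("statistics")  # nothing executes yet
print(stats.mean([1, 2, 3]))       # first use triggers the real import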

Leverage Python’s __main__ Feature

In some cases, you can move code that causes circular dependencies to the if __name__ == ‘__main__’: block. This ensures that the problematic code is only executed when the script is run as the main program. This technique allows you to isolate the problematic code, preventing circular dependencies from affecting other parts of your program.
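A minimal sketch of this idea is shown below; the exporter module and its generate function are hypothetical names used only for illustration:

# report.py
def build_rows():
    return ["row-1", "row-2"]

if __name__ == "__main__":
    # This import runs only when report.py is executed directly,
    # so modules that merely import report never trigger it
    from exporter import generate
    generate(build_rows())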

Move Shared Code to a Separate Module

Identify shared code that multiple modules depend on and move it to a separate module. By centralizing shared functionality, you can reduce interdependencies between modules, making it easier to manage your codebase and prevent circular imports.
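For example, if hypothetical orders and invoices modules both need the same helper, moving it into a shared common module means neither has to import the other:

# common.py
def format_currency(amount):
    return f"${amount:,.2f}"

# orders.py
from common import format_currency

# invoices.py
from common import format_currency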

Reorganize Your Code

Consider restructuring your code to create clear boundaries between modules. Good code organization can go a long way in preventing circular imports. By following the principles of modularity and separation of concerns, you can design a more robust and maintainable codebase.

Move the Import to the End of the Module

Sometimes, moving the import statements to the end of the module can resolve circular import issues. By defining functions and classes before performing imports, you ensure that the necessary elements are available when needed.
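Here is a small sketch of the pattern with two hypothetical modules; because the import in module_a.py runs after its definitions, module_b.py can import module_a during its own initialization without hitting a half-empty module:

# module_a.py
def function_a():
    return "A, calling: " + module_b.function_b()

# Import placed after the definitions above
import module_b

# module_b.py
import module_a

def function_b():
    return "This is function B"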

Conclusion

In conclusion, addressing and preventing circular imports in Python is a crucial skill for any developer aiming to write clean, maintainable, and efficient code. Circular dependencies can introduce a myriad of challenges, from code readability and debugging complexities to performance bottlenecks. However, armed with the strategies and best practices outlined in this article, you can confidently tackle circular import issues in your projects.

Remember that prevention is often the best cure. By structuring your code thoughtfully, using relative imports, and embracing lazy loading, you can significantly reduce the likelihood of circular dependencies. When they do arise, a combination of import reorganization and modularization can help you untangle the web of dependencies. With these tools at your disposal, you can close the loop on Python circular import issues and pave the way for robust and scalable Python projects.

Getting Started with npm Basics: Mastering Node Package Manager Commands

Introduction

Much of the modern web is built on JavaScript. Everything from single-page sites to highly sophisticated interactive applications uses JavaScript as a base language. That prominence puts real demands on the ecosystem: writing a useful module is often the easy part, while packaging, distributing, and maintaining it is where the true complexity lies.

The size of the JavaScript ecosystem and the surge in demand brought numerous tools to market to address these limitations. Node package manager (npm) took the spotlight and became a battle-tested tool for managing and maintaining JavaScript packages, particularly for Node.js – a cross-platform runtime for building server-side JavaScript applications that run outside the browser.

What is npm?

npm (node package manager) helps overcome the complexities of packaging and distributing JavaScript modules, making them available through a centralized registry as installable dependencies.

In a nutshell, npm allows developers to build, host, and consume shareable JavaScript packages rapidly and reliably. The package lifecycle is handled through three components – the npm website, the registry, and the CLI.

Website

The npm website acts as a centralized user interface that makes it simple to access features, packages, libraries, and administrative options within the npm ecosystem.

The website offers an extensive view of a package’s code, dependencies, version history, and metadata, along with its readme and download stats, and supports both public and private packages.

Registry

The registry is the centralized hub, historically built on CouchDB, that stores npm packages and serves queries and uploads. Packages are generally looked up by name, version, or scope; behind the scenes, npm talks to the registry to resolve a package by name or version and read its metadata.

By default, npm points to a public registry hosting general-purpose and open-source packages with customizable configurations to leverage private registry capabilities. Registries are boundless, as one can use any compatible or custom registry (public or private) to host and offer npm packages.

CLI

The CLI (command line interface) is a tool/utility bundled with dependency handling, package management, deployment, and administrative control capabilities.

The command-line tool offers programmatic access to all npm features, from project setup to dependency management and automation. Developers rely heavily on the npm CLI to maintain package integrity and manage dependencies.

Getting started with npm

The rise of Node.js took the web paradigm by storm. The runtime environment has become a go-to option for developers intending to build cutting-edge modules using robust and sound principles that are battle-tested for security and performance. Understanding the npm domain and its internals holds weight if the aim is to build robust and resilient applications.

The walk-through is aimed at ensuring the reader has a thorough understanding of npm and the ecosystem and is equipped with the tools and best practices to make an impact with their innovative ideas.

Node.js installation and setup

Node.js is an OS-agnostic runtime environment that enables developers to tap into the capabilities of JavaScript from outside the browsers. Based on the underlying OS, there are multiple approaches for installing Node.js.

The installable binary/package can be downloaded from the Node.js official download page for setup and configuration. The installer comes in two flavors: Current, which ships the newest (and sometimes less-tested) features, and LTS (long-term support), which focuses on stability and security.

Depending on the OS, you can download a tarball for Linux, a PKG installer for macOS, or an MSI for Windows and install it following standard OS-specific practices. Alternatively, install Node.js and npm with your system’s package manager:

#linux
sudo apt install nodejs npm

#mac
brew install node

#windows
#follow UI wizard instructions

Post execution, successful installation of the latest version of Node.js and npm can be verified using the following command.

#Node.js version validation
$ node -v
 v20.2.0

#npm version validation
$ npm -v
 9.6.6

Updating npm

Node package manager is an active project, with new features released periodically. Keeping npm itself current and migrating an older project’s dependencies is vital for keeping up with evolving requirements and taking advantage of new features.

npm is a command-line tool that can update itself: installing the npm package globally at its latest tag upgrades the CLI in place, and the same install command is used to update project dependencies.

#upgrade npm to the latest version
npm install -g npm@latest

Tip: An npm module npm-upgrade can be leveraged to easily update outdated dependencies with changelog inspection support and eliminate manual dependency upgrades.

Basic npm commands

npm is the root-level entry point to work with a suite of commands for handling and administering the modules. Project setup to deployment and beyond can be achieved as npm consists of everything a developer needs to get going.

Let us explore essential npm basics and how using npm commands architects and navigates developers to build sophisticated packages:

npm init

Modules designed and developed without structure are hard to manage and maintain. Every application needs a source of truth containing essential project information. npm parses the package.json file, which holds the project’s metadata and configuration. For Node.js projects, package.json is that source of truth, and it gets created when the project is initialized.

#Initialize the project
npm init

#Initialize the project with default values
npm init -y

npm search and info

Searching for relevant packages and reviewing their metadata before installing is crucial to understanding what a package offers. The npm search command finds packages by name or description, and npm info shows a package’s metadata in detail.
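For example, using the express package purely as an illustration:

#search the registry by keyword
npm search express

#show detailed metadata for a package
npm info express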

npm install

Dependencies are the puzzle blocks responsible for shaping the final version of the project. A common pattern in Node.js applications is that they are interconnected with other publicly available promising modules in the npm registry. Most developers generally avoid the reinvention of the wheel by installing and importing public modules.

Reusing these tested modules brings proven, secure functionality into the application and keeps the project easier to structure and debug. The npm install command downloads and installs packages and records them as dependencies in package.json.

#Install a package
npm install <name>

npm start and stop

Node.js runtime offers full-stack functionality with both client-side and server-side development. The need to control and explore the application behavior locally and on the server becomes vital. To orchestrate Node.js packages, npm equips developers with a configuration in the package.json file to modulate the behavior of application start and stop via scripts.

"scripts": {
  "start": "node server.js",
  "stop": "pkg stop server.js"
}

With this configuration in place, npm runs the corresponding script by parsing the JSON and looking up the key.

#start Node.js application
npm start

#stop Node.js application
npm stop

Tip: npm run is a handy option to trigger custom scripts from the configuration file and npm restart is useful to restart a package when unexpected behavior is observed.

NPX

The requirement sometimes demands running one-time custom commands or using a specific version (old) of the package without installing globally to test out functionality. NPX makes it possible to run arbitrary commands from an npm package.

#run a specific version of npm package without installing it globally
npx <pkg-name>@<version>

How to initiate the first project with npm

The first step for a Node.js application is to initialize it with npm for dependency management. npm init turns an ordinary project folder into an npm package, and its output is the package.json file. The package.json file is essential for managing dependencies, scripts, and metadata.

The file metadata is dependent on the keyed inputs during initialization. Although the metadata can be altered, it is important to declare the key-value pairs mindfully to avoid inconsistencies or irregularities in the application.

Package.json metadata properties

Understanding the module’s metadata properties gives a better grasp of npm. Running npm init generates the package.json file in the current directory and commits the values you enter as metadata.

{
 "name": "npm_test",
 "version": "0.1.0",
 "description": "This is a sample npm project",
 "main": "index.js",
 "scripts": {
   "test": "exec"
 },
 "repository": {
   "type": "git",
   "url": "git+https://github.com/account/repo.git"
 },
 "keywords": [
   "test",
   "sample",
   "mlops"
 ],
 "author": "youremail@gmail.com",
 "license": "MIT",
 "bugs": {
   "url": "https://github.com/account/repo/issues"
 },
 "homepage": "https://github.com/account/repo#readme"
}

The example metadata is largely self-explanatory: it carries the project name, version, and description. The main property points to the entry-point JS file, and the scripts property holds the commands that start, stop, test, and otherwise administer the application. The repository and bugs sections reference the GitHub repository used for version control and issue tracking, while the author and license properties record the creator and the usage terms.

Package.json dependencies management

All the packages installed in the application are listed under the dependencies section of the package.json file, and npm manages them behind the scenes by default.

"dependencies": {
 "react": "^18.2.0"
}

Every npm install adds a new entry to the dependencies dictionary. Adding --save-dev (or -D) to the npm install command records the package under devDependencies instead, so it is only used during development.

Installing packages

Modern web apps depend on various modules to operate, which makes npm install one of the most frequently used commands. Packages can be installed in several ways, depending on how and where a module is needed.

Let us look at the different ways to install packages, using the express module – a mature library for handling routes and managing servers – as the example.

Installing packages locally

The recommended approach is to install packages locally whenever possible, keeping modules isolated and avoiding unexpected behavior in other projects on your system. npm install places packages in a node_modules sub-directory inside the current project’s directory.

#install express module
npm install express

#using syntactic shortcut
npm i express

Installing global packages

Popular modules like express are useful in many applications, which makes installing them in every project a repetitive task. Installing such a package globally makes it available system-wide.

#install express module globally
npm install -g express

Tip: Globally installed packages can lead to version conflicts across projects. Make sure all your projects work with the same version.

Installing a specific package version

When a package is installed by name alone, npm fetches the latest published version. That latest version can occasionally be unstable, introduce security problems, or break existing functionality, so pinning dependencies by explicitly passing a version is often recommended.

#install 4.18.1 version of express
npm install express@4.18.1

Installing from package.json

If the required modules and their versions are known beforehand, they can be declared directly in package.json. With the critical dependencies listed there, a bare npm install resolves and installs all of them at once.
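For example, once the dependencies are declared in package.json, a bare install restores all of them:

#install everything listed under dependencies and devDependencies
npm install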

List of installed packages

Extracting the list of packages is necessary to understand the dependency hierarchy for debugging dependency conflicts, package version issues, and logging. Outdated packages can be pinpointed directly from npm CLI or by extracting the list to an output file.

#list npm packages
npm list

#list a specific package information
npm list <pkg-name>

#print the list as JSON (redirect to a file if needed)
npm list --json > dependencies.json

Updating packages

New stable, compatible releases of a module appear periodically, and to use their features the installed copy needs to be updated. The npm update command updates a package to the newest version allowed by its semver range in package.json and refreshes the local copy.

#update npm package
npm update <pkg-name>

Uninstalling packages

The size and dependency hierarchy of the module play a vital part in optimization and I/O. By default, npm takes care of removing unused packages through a process called pruning. But to maintain a compact and performant module, uninstalling packages from the dependency hierarchy is crucial.

#uninstall a package
npm uninstall <package-name>

npm performs an extra step after uninstallation: it checks the remaining dependency tree and removes any packages that were only needed by the uninstalled one.

Package-lock.json

Because Node.js applications run across platforms and operating systems, installation guarantees matter: the application should get a consistent, reproducible dependency tree regardless of where it is installed.

Package-lock.json ensures module version and hierarchy lock to attain deterministic builds, security, and resolution.
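In practice, a reproducible install can be taken straight from the lock file, which is especially useful in CI builds:

#install exactly what package-lock.json specifies
npm ci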

Semantic versioning

Packages evolve in an orderly way. Breaking changes and major restructuring land in the major version (X.0.0), new backwards-compatible features arrive in minor versions (X.Y.0), and bug fixes are released as patch versions (X.Y.Z), where X, Y, and Z are the version numbers.

Semantic versioning is the scheme that lets an application accept reliable, compatible updates. A caret range (^), such as “^1.4.1”, allows minor and patch updates; a tilde range (~), such as “~1.4.1”, allows only patch updates; and with no range symbol at all, only that exact version is accepted.
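A dependencies block mixing these ranges might look like the following (the package names and versions are purely illustrative): express can float up to any compatible 4.x release, lodash only to 4.17.x patches, and left-pad is pinned exactly.

"dependencies": {
  "express": "^4.18.1",
  "lodash": "~4.17.21",
  "left-pad": "1.3.0"
}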

Audit

Security management and vulnerability remediation are key aspects of web applications. Node.js applications routinely handle sensitive and PII data, so hardening the security posture is a must. npm audit scans the project’s dependencies and flags known security issues along with their severity.

#Command to conduct security audit
npm audit

The npm audit is a valuable command for locating and mitigating security vulnerabilities with robust monitoring capabilities.
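When fixes are available, npm can also attempt to apply them automatically:

#attempt to automatically upgrade vulnerable dependencies
npm audit fix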

Cache

Network I/O is one factor behind slow installs, so eliminating repeated downloads helps. The npm cache command manages a local cache of downloaded packages: npm cache add stores a package in the cache for reuse, and npm cache clean clears it.

#add npm package cache
npm cache add <pkg-name>

Tip: npm install automatically stores downloaded packages in the cache (by default under ~/.npm on Linux and macOS), so subsequent installs can reuse them.

Using npm packages

Using npm packages is straightforward thanks to mature, publicly available documentation. Hands-on guides and how-to examples make it quick to adopt a package in your own project.

Getting help

npm is a vast ecosystem with a thriving community, and there are many ways to get help. The npm help command is the first stop for details on npm’s own commands, what they offer, and how they operate.

The npm community forums and official documentation cover the most common problems and solutions, and Stack Overflow is a reliable place to seek help from npm experts and practitioners.

Tip: You can always open an issue on the npm GitHub repository and seek help from the maintainers or report bugs.

Conclusion

The tides of web development keep shifting with the modern tech stack, and one-size-fits-all solutions are hard to come by. Node.js has emerged as a game changer, and npm is what keeps its ecosystem manageable: mastering the basic commands covered here – init, install, update, audit, and the rest – gives you a solid footing for building and maintaining Node.js projects.

Python Import: Mastering the Advanced Features

In the ever-evolving landscape of Python programming, the ‘import’ statement stands as a foundational pillar, enabling developers to harness the full power of the language’s extensive libraries and modules. While beginners often start with simple imports to access basic functionality, delving deeper into Python’s import system unveils a world of advanced features that can significantly enhance code organization, reusability, and maintainability.

In this exploration of “Python Import: Mastering the Advanced Features,” we will embark on a journey beyond the basics, diving into techniques such as relative imports, aliasing, and the intricacies of package structure. By the end of this journey, you’ll not only be well-versed in the nuances of Python’s import capabilities but also equipped with the skills to build modular and extensible code that can withstand the complexities of real-world software development.

Basic Python import

In the realm of Python programming, the ‘import’ statement serves as the gateway to a vast ecosystem of pre-written code that can save developers both time and effort. At its core, importing allows you to access and utilize functions, classes, and variables defined in external files, known as modules. Modules are the fundamental building blocks of Python’s import system, and understanding them is the first step towards mastering advanced import features.

Modules

In Python, a module is essentially a file containing Python statements and definitions. These files can include functions, classes, and variables that you can reuse in your own code. When you import a module, you gain access to its contents, making it easier to organize and manage your codebase. Python’s standard library is a treasure trove of modules that cover a wide range of functionalities, from handling data structures to working with dates and times. Understanding how to import and utilize these modules effectively is a crucial skill for any Python developer.

# Example of importing a module from the standard library
import math
# Using a function from the math module
result = math.sqrt(25)
print(result)  # Output: 5.0

Packages

As your Python projects grow in complexity, you’ll often find yourself working with more than just individual modules. Enter packages. Packages are a way to organize related modules into a directory hierarchy, making it easier to manage large codebases. By mastering packages, you can structure your projects in a modular and organized manner, improving code readability and maintainability.

# Example of importing a module from a package
from mypackage import mymodule

# Using a function from the imported module
result = mymodule.my_function()
print(result)

Absolute and relative imports

Python offers two primary ways to import modules and packages: absolute and relative imports. Absolute imports specify the complete path to the module or package you want to use, while relative imports reference modules and packages relative to the current module. Understanding when to use each type of import is crucial for writing clean and maintainable code.

# Absolute import
from mypackage import mymodule

# Relative import
from . import mymodule

Python’s import path (standard library, local modules, third party libraries)

To import modules successfully, Python relies on a search path that includes directories for the standard library, local modules, and third-party libraries. Learning how Python manages this import path is essential for resolving import errors and ensuring your code can access the required modules. Whether you’re working with built-in modules, your own project-specific modules, or external libraries, understanding Python’s import path is a key aspect of mastering advanced import features.

# Checking the sys.path to see the import search path
import sys
print(sys.path)

Third-party libraries are a valuable part of the Python ecosystem, allowing developers to quickly and easily add new features and functionality to their applications. However, third-party libraries can also introduce security vulnerabilities into a project.

Structuring your imports

Code organization is a vital aspect of software development, and structuring your imports can greatly impact the readability and maintainability of your code. Establishing a clear and consistent import style not only makes your code more accessible to other developers but also helps you navigate your own projects more efficiently. We’ll explore best practices for structuring your imports to ensure your codebase remains clean and comprehensible.

# Organizing imports according to PEP 8 style guide
import os
import sys

# Importing standard library modules
import math
import datetime

# Importing third-party libraries
import requests
import pandas as pd

# Importing local modules
from mypackage import mymodule

Namespace packages

Namespace packages are a lesser-known but valuable feature of Python’s import system. They allow you to create virtual packages that span multiple directories, providing a flexible way to organize and distribute your code. Mastering namespace packages can be especially beneficial when working on large and collaborative projects.

# Namespace package example
# mypackage/__init__.py
__path__ = __import__('pkgutil').extend_path(__path__, __name__)

# Now you can have modules in different directories under 'mypackage'
from mypackage.subpackage import module1
from mypackage.anotherpackage import module2

Imports style guide

To maintain a high level of code quality and consistency across projects, adhering to an imports style guide is essential. We’ll delve into recommended conventions and best practices for naming, organizing, and documenting your imports. By following a style guide, you can ensure that your code remains clean, readable, and accessible to other developers.

# Imports should be grouped and separated by a blank line
import os
import sys

import math
import datetime

import requests
import pandas as pd

from mypackage import mymodule

In the world of Python import statements, these advanced features are the keys to unlocking greater code organization, reusability, and maintainability. As we delve deeper into each topic with code examples, you’ll gain a comprehensive understanding of how to harness the full potential of Python’s import system and elevate your programming skills to the next level.

Resource imports

As Python applications continue to expand in complexity and diversity, the need to manage and incorporate external resources becomes increasingly important. Whether you’re dealing with data files, images, or other non-Python assets, mastering resource imports is an essential skill for any developer. This section explores advanced import features related to resources, including the introduction of importlib.resources and practical applications like using data files and adding icons to Tkinter graphical user interfaces (GUIs).

Introducing importlib.resources

Python 3.7 introduced the importlib.resources module, which provides a streamlined and Pythonic way to access resources bundled within packages or directories. This module offers a unified API to access resources regardless of whether they are packaged within a Python module or exist as standalone files on the file system.

# Example of using importlib.resources to access a resource in a package
import importlib.resources as resources
from mypackage import data

# Access a resource file 'sample.txt' in the 'data' package
with resources.open_text(data, 'sample.txt') as file:
    content = file.read()
    print(content)

Using data files

Data files are a common type of resource in software development. Whether it’s configuration files, CSV data, or text files, you often need to read and manipulate these files within your Python applications. By mastering resource imports, you can efficiently access and utilize data files, enhancing the functionality of your programs.

# Reading data from a text file using resource import
import importlib.resources as resources
from mypackage import data

# Access and read a data file 'config.ini'
with resources.open_text(data, 'config.ini') as file:
    for line in file:
        print(line.strip())

Adding icons to Tkinter GUIs

Graphical user interfaces (GUIs) are a cornerstone of modern software development, and incorporating icons into your Tkinter-based applications can significantly enhance their visual appeal. Resource imports come into play when you want to bundle icons or image files with your application and access them seamlessly. Here’s how you can use resource imports to add icons to your Tkinter GUIs.

# Adding an icon to a Tkinter window using resource import
import tkinter as tk
import importlib.resources as resources
from mypackage import icons

root = tk.Tk()
root.title("My GUI")

# Load and set the application icon; importlib.resources has no
# resource_filename(), so use the path() context manager instead
with resources.path(icons, 'my_icon.ico') as icon_path:
    root.iconbitmap(default=str(icon_path))

# Create and configure GUI components here

root.mainloop()

Mastering resource imports, as demonstrated through importlib.resources, empowers you to efficiently manage and incorporate non-Python resources into your Python projects. Whether you need to access data files, images, or icons for GUIs, these advanced import features provide a consistent and reliable way to enrich your applications with external assets, enhancing both their functionality and aesthetics.

Dynamic imports

In the world of Python, sometimes you encounter situations where you don’t know the exact modules or packages you need until runtime. Dynamic imports, enabled by the importlib module, provide a powerful way to load and utilize modules dynamically based on program logic. This section dives deep into dynamic imports using importlib, showcasing how this advanced feature can make your Python applications more flexible and adaptable.

Using importlib

The importlib module, introduced in Python 3.1, offers a programmatic way to work with imports. It allows you to load modules, packages, and even submodules dynamically, giving your code the ability to make decisions about which code to use at runtime. Here’s an overview of how to use importlib for dynamic imports:

import importlib

# Dynamic import of a module
module_name = "mymodule"
module = importlib.import_module(module_name)

# Access functions or classes from the dynamically imported module
result = module.my_function()

Dynamic imports are particularly useful in scenarios where you have multiple implementations of a feature, and you want to choose one at runtime based on conditions like user input, configuration settings, or the environment.

import importlib
import sys

# Determine which module to import based on user input
user_choice = input("Enter 'A' or 'B': ")

if user_choice == 'A':
    module_name = "module_A"
elif user_choice == 'B':
    module_name = "module_B"
else:
    print("Invalid choice")
    sys.exit(1)

try:
    module = importlib.import_module(module_name)
    result = module.perform_action()
except ImportError:
    print(f"Module {module_name} not found.")

By embracing dynamic imports through importlib, your Python applications can become more adaptable and versatile, capable of loading and using modules, while making your codebase more resilient to changes and customizable according to runtime conditions.

The Python import system

The Python import system is a fundamental aspect of the language that enables developers to access and incorporate external code into their programs. While many are familiar with the basic mechanics of importing modules, mastering the advanced features of the import system opens up a world of possibilities.

In this section, we explore the intricacies of the Python import system, including techniques like importing internals, using singletons as modules, reloading modules, understanding finders and loaders, and even automating the installation of packages from PyPI. Additionally, we delve into the less conventional but equally powerful concept of importing data files, which can be a game-changer for many applications.

Importing internals

Python allows you to import not only external modules but also internal parts of a package or module. This feature can be incredibly useful when you want to organize your codebase into submodules and selectively expose certain components to the outside world. By mastering this technique, you can achieve a fine-grained control over what parts of your code are accessible to other developers.

# Importing an internal submodule
from mypackage.internal_module import my_function

Singletons as modules

In Python, modules are singletons by design, meaning that they are loaded only once per interpreter session. This property makes modules suitable for storing and sharing data across different parts of an application. By mastering the concept of singletons as modules, you can create global variables or shared resources that remain consistent throughout your program’s execution.

# Creating a singleton module for configuration settings
# config.py
database_url = "mysql://user:password@localhost/mydb"

# main.py
import config
print(config.database_url)  # Access the configuration settings

Reloading modules

Python’s import system allows you to reload modules dynamically during runtime. This capability is especially valuable during development when you want to test and iterate on code changes without restarting your entire program. By mastering module reloading, you can streamline your development workflow and reduce the need for frequent application restarts.

# Reloading a module
import importlib
import mymodule
# ... make changes to mymodule ...
importlib.reload(mymodule)  # Reload the module to apply changes

Finders and loaders

Behind the scenes, Python employs a sophisticated mechanism of finders and loaders to locate and load modules. Understanding these components of the import system can provide insights into how Python locates and uses modules. While you may not need to interact with finders and loaders directly in most cases, having a grasp of these concepts is valuable for troubleshooting import-related issues.
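A quick, minimal way to peek at this machinery is to list the finders Python consults:

# Each entry on sys.meta_path is a finder; Python asks them in order
# for a module spec, whose loader then executes the module
import sys

for finder in sys.meta_path:
    print(finder)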

Automatically installing from PyPI

The Python Package Index (PyPI) is a vast repository of third-party packages that can enhance the functionality of your Python applications. Mastering the ability to automatically install packages from PyPI within your code can simplify the setup process for your projects and ensure that all required dependencies are available.

# Automatically installing a package from PyPI using pip
import subprocess
import sys

package_name = "requests"
# Call pip through the current interpreter so the package is installed
# into the same environment that will import it
subprocess.check_call([sys.executable, "-m", "pip", "install", package_name])
import requests

Importing data files

Beyond code, Python’s import system can be extended to handle data files. This advanced feature allows you to bundle and access non-code resources like text files, configuration files, and more within your Python projects. By mastering the art of importing data files, you can create self-contained and versatile applications that can seamlessly incorporate external data.

# Importing data from a text file
import importlib.resources as resources
from mypackage import data

with resources.open_text(data, 'config.ini') as file:
    config_data = file.read()

In the world of Python import statements, mastering these advanced features of the import system can take your programming skills to the next level. From controlling internal imports to managing data files and automating package installations, these techniques empower you to create more efficient, organized, and powerful Python applications.

Python import tips and tricks

In the journey to master the advanced features of Python imports, it’s essential to equip yourself with a toolkit of tips and tricks to navigate real-world scenarios effectively. This section delves into a range of strategies and solutions that can help you handle package compatibility across Python versions, address missing packages using alternatives or mocks, import scripts as modules, run Python scripts from ZIP files, manage cyclical imports, profile imports for performance optimization, and tackle common real-world import challenges with practical solutions.

  • Handling Packages Across Python Versions: Python’s continuous development results in version discrepancies between packages. To maintain compatibility across various Python versions, consider using tools like ‘six’ or writing platform-specific code that dynamically adapts to the Python version being used.
  • Handling Missing Packages Using an Alternative: Occasionally, a required package may not be available or suitable for your project. In such cases, you can explore alternative packages or libraries that offer similar functionality. Properly handling missing packages ensures your project remains functional and adaptable (a minimal sketch of this fallback pattern follows this list).
  • Handling Missing Packages Using a Mock: During development or testing, you may encounter situations where a package isn’t readily available. Mocking the missing package’s functionality can help you continue working on your code without disruptions. Libraries like ‘unittest.mock’ are invaluable for creating mock objects.
  • Importing Scripts as Modules: In some cases, you might want to reuse code from Python scripts as if they were modules. You can achieve this by encapsulating the script’s functionality into functions or classes and then importing those functions or classes into other Python files.
  • Running Python Scripts from ZIP Files: When working on distribution or deployment, you may need to bundle multiple Python scripts into a ZIP archive for easier distribution. Python’s ‘zipimport’ module allows you to import and run code directly from ZIP files, simplifying the distribution and execution process.
  • Handling Cyclical Imports: Cyclical imports, where modules depend on each other in a loop, can lead to confusion and errors. To address this, refactor your code to eliminate cyclical dependencies or use techniques like importing modules locally within functions to break the circular references.
  • Profile Imports: For performance optimization, profiling your imports can provide insights into bottlenecks in your code. Tools like ‘cProfile’ can help you identify which modules are taking the most time to import and address potential performance issues.
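As a minimal sketch of the missing-package items above, a try/except import can fall back to an alternative library; ujson is an optional third-party package here, and the same structure works for substituting a mock during testing:

try:
    import ujson as json  # faster third-party parser, may not be installed
except ImportError:
    import json  # standard-library fallback with a compatible basic API

print(json.dumps({"fallback": True}))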

By mastering these tips and tricks for Python imports, you can tackle a wide range of import-related challenges that arise in your development journey. These strategies not only enhance code reliability and maintainability but also empower you to adapt to changing requirements and evolving Python ecosystems effectively.

Conclusion

Mastering the advanced features of Python imports is akin to unlocking the hidden potential of this versatile programming language. The journey through Python imports has revealed the rich tapestry of possibilities that await developers who seek to harness the full power of this language feature. Whether you’re building applications, managing dependencies, or optimizing for performance, a deep understanding of advanced import techniques empowers you to write more efficient, organized, and adaptable code.

As you continue your Python programming journey, remember that mastering imports is not just about writing code—it’s about crafting resilient solutions to real-world challenges. By applying the tips and tricks explored in this guide and staying curious in your pursuit of knowledge, you’ll be well-equipped to face the complexities of modern software development with confidence and creativity.

Preventing SQL Injections With Python

For Python developers, it is essential to protect your project from potential SQL injection attacks. SQL injection attacks happen when malicious SQL code is embedded into your application, allowing the attacker to indirectly access or modify data in the database. As you’re probably already aware, such an attack can have disastrous consequences, like data theft and loss of integrity, which is why preventing SQL injection attacks is critical for any web application.

In this blog post, we will see how you can protect your applications from SQL injection attacks when working with Python. We will also discuss some common techniques that attackers use to exploit SQL injection vulnerabilities, and give you some important tips to prevent such attacks.

Understanding SQL injections in Python

An SQL injection is a type of security exploit in which malicious code is inserted into strings that are later passed to an instance of SQL Server for parsing and execution. This malicious code can be used to manipulate the behavior of the database and potentially gain access to sensitive information. 

SQL injections exploit vulnerabilities in the way applications interact with databases. An attacker can insert a command into a web app’s user input, which is then passed to the underlying database for execution. This allows the attacker to bypass authentication mechanisms and gain access to sensitive data or modify its contents. 

For example, an attacker could enter a malicious command that would execute a delete query from a database. The command gets sent to the database for execution, potentially granting the attacker access to data or simply allowing them to delete data from the database. 

Most commonly, hackers will insert malicious SQL commands into user-supplied input fields, such as search boxes and login forms. An attacker might also use a tool to dynamically generate malicious code and send it to the database.

Luckily, Python provides several methods for preventing SQL injection attacks. 

Related: How to Manage Python Dependencies

Good practices to prevent SQL injections

There are a few common mistakes developers make that increase the risk of SQL injections, including poor coding practices, lack of input validation, or insecure database configuration. However, with good Python coding practices, developers can drastically reduce this risk. 

To prevent SQL injections from occurring in your application, apply these good habits:

  • Always use parameterized queries when interacting with a database. This ensures that user input is never spliced into the SQL text itself, which reduces the risk of an injection attack (a minimal sketch follows this list).
  • Implement input validation on all user-provided data. This helps to prevent malicious commands from even being executed in the application. By validating the user input, you can quickly detect any suspicious activity and stop it before it even gets to the server, let alone the database. You can do this with the help of try-catch blocks (or try-except ones in Python).
  • Ensure that the database is properly configured by implementing the right access restrictions and authentication mechanisms. This includes limiting the access of users with specific permissions and implementing reliable authentication processes.
  • Regularly update your code and any third-party packages you’re using. By regularly patching your code and checking for vulnerabilities, you close off potential gateways for an attacker. Because SQL injection flaws are so common, fixes for them appear frequently in new releases of most software, so staying up to date is essential.
  • Use an intrusion detection system. Although it’s often overlooked, using an intrusion detection system to monitor for suspicious activity or attempted injection attacks is highly beneficial. It can help identify malicious activity that could indicate an attack in progress.
  • Perform regular security audits of your app’s code and database configuration. This enables you to identify any potential vulnerabilities that could be exploited in an attack.
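Here is a minimal sketch of the parameterized-query advice from the first item in the list, using the sqlite3 module from the standard library; the table and the input value are purely illustrative:

import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (name TEXT)")
cur.execute("INSERT INTO users VALUES ('alice')")

user_input = "alice' OR '1'='1"  # a typical injection attempt

# Unsafe: string formatting splices the input into the SQL text
# cur.execute(f"SELECT * FROM users WHERE name = '{user_input}'")

# Safe: the ? placeholder sends the value separately from the query,
# so the injection attempt is treated as a literal string
cur.execute("SELECT * FROM users WHERE name = ?", (user_input,))
print(cur.fetchall())  # [] – no row matches the malicious string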

Learn More: Most Secure Programming Languages 

Final thoughts

SQL injection is a serious threat to web applications and can have devastating consequences if left unchecked. Preventing SQL injections while working with Python is not that difficult, as long as you stay up to date with the latest coding practices and follow the tips we mentioned above. Making sure that your app is secure is not only important for your users, but it also helps maintain the company’s good reputation.

]]>
Asynchronous Programming in Python – Understanding The Essentials https://www.mend.io/blog/asynchronous-programming-in-python-understanding-the-essentials/ Thu, 22 Dec 2022 13:25:03 +0000 https://mend.io/blog/asynchronous-programming-in-python-understanding-the-essentials/ Asynchronous programming in Python is the process of writing concurrent code that runs asynchronously – i.e. code that doesn’t block while waiting for other operations to finish. It allows an app instance to make progress on multiple tasks at the same time. This helps speed up the required processing time because tasks can overlap instead of waiting on one another.

Asynchronous programming can be leveraged in Python to make applications more resilient, flexible, and efficient. Tasks performed asynchronously help maintain responsiveness in programs and prevent blocking the main thread. This accelerates response time when dealing with multiple tasks at once.

Let’s dive deeper into what asynchronous programming is, when to perform it, and how to implement it.

What is asynchronous programming?

Asynchronous programming refers to a form of multitasking that allows for faster execution of programs and tasks by dividing a single task into smaller chunks of code. This approach makes it easier for a program to process multiple requests at once, allowing the user to make more efficient use of their time and resources. It allows developers to create complex software applications with minimal effort.

In Python, asynchronous programming is based on the event-driven programming model. This involves using ‘callbacks’, or functions that are triggered as soon as an event occurs. These functions can be used to perform a wide variety of tasks, like making an HTTP request, sending a notification, or even executing some long-running code without blocking the main thread. 

Usually, asynchronous code in Python revolves around an event loop. This loop runs continuously and checks for any new events that need to be processed. Once an event is detected, the loop calls the appropriate callback function.
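
As a tiny sketch of that idea (the event name below is purely illustrative), asyncio lets you schedule a plain callback on the loop and have it fire on the loop’s next pass:

import asyncio

def on_event(name):
    # Plain callback fired by the event loop once it regains control
    print(f"Handled event: {name}")

async def main():
    loop = asyncio.get_running_loop()
    # Schedule the callback; it runs on the next iteration of the loop
    loop.call_soon(on_event, "new_order")
    await asyncio.sleep(0)  # yield control so the pending callback can run

asyncio.run(main())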

It’s worth mentioning that there’s a solid number of reliable Python packages like AsyncIO and Twisted, which make it much easier to write asynchronous code. These libraries provide a range of tools that enable developers to create efficient and scalable applications. 

Related: How to Manage Python Dependencies

When do I need asynchronous execution?

Asynchronous programming is primarily used for applications that require a high degree of concurrency. In most cases, this includes web applications that need to handle thousands of requests simultaneously, or even some applications that require the execution of long-running tasks. Since asynchronous programming can allow for the creation of more responsive and event-driven apps, it can significantly improve user experience.

Python is an ideal language for developing applications with asynchronous execution. Asynchronous programming allows developers to take advantage of Python’s high-level syntax and object-oriented style. This makes it easier to write code that is more efficient, faster, and easier to maintain. Some use cases that work very well with asynchronous execution include:

  • Web-based applications that need to handle many requests simultaneously.
  • Smooth user experience for real-time applications such as online gaming or video streaming.
  • Data processing applications that need to execute long-running tasks in the background.
  • Distributed systems and microservices architectures.

To get the most out of asynchronous programming in Python, it is important to understand how the event loop works and how to use callbacks. It is also beneficial to understand tools such as AsyncIO and Twisted that can make writing asynchronous code much easier.

Learn More: Most Secure Programming Languages 

Implementing async code in Python

Python includes several modules that simplify the process of writing and managing asynchronous code. These provide powerful features such as error handling, cancellation and timeouts, and thread pools.

The AsyncIO module, which ships with Python’s standard library, is the most popular option for implementing asynchronous code. It provides a range of tools that make it easier to write and maintain asynchronous code. This includes features such as the event loop, coroutines, futures, and more.
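
As a minimal sketch of those pieces (the task names and delays below are made up for illustration), two coroutines can run concurrently on the event loop with asyncio.gather:

import asyncio

async def fetch(name, delay):
    # Simulates a slow I/O call; await hands control back to the event loop
    await asyncio.sleep(delay)
    return f"{name} finished after {delay}s"

async def main():
    # Both coroutines run concurrently, so this takes about 2 seconds, not 3
    results = await asyncio.gather(fetch("task-a", 1), fetch("task-b", 2))
    for line in results:
        print(line)

asyncio.run(main())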

Just like AsyncIO, Twisted is another popular Python library for asynchronous code. It provides a range of features such as an event-driven networking engine, threading, process management, and more. However, unlike AsyncIO, Twisted is more server- and network-oriented.

To get started, junior developers should familiarize themselves with the basics of asynchronous programming and then explore the various tools available in Python. With practice, developers can write code that is more efficient and easier to maintain. 

Final thoughts

Asynchronous programming can be an incredibly valuable tool for both new and experienced Python developers. It can make applications more responsive, and improve their user experience. By understanding the basics of asynchronous programming, developers can take advantage of powerful features such as the event loop, coroutines, futures, and more. With practice, developers that master it can create applications that are responsive, efficient, and scalable. 

]]>
What Are The Key Considerations for Vulnerability Prioritization? https://www.mend.io/blog/what-are-the-key-considerations-for-vulnerability-prioritization/ Wed, 21 Dec 2022 20:31:18 +0000 https://mend.io/what-are-the-key-considerations-for-vulnerability-prioritization/ When it comes to open source vulnerabilities, we seem to be in permanent growth mode. Indeed, data from Mend’s Open Source Risk Report showed 33 percent growth in the number of open source software vulnerabilities that Mend added to its vulnerability database in the first nine months of 2022 compared with the same time period in 2021. However, while some vulnerabilities pose a severe business risk — hello, log4j — others can be safely ignored. The question is, how do you effectively prioritize vulnerabilities? When prioritizing vulnerabilities, start by evaluating a vulnerability in terms of the following six factors:

1. Severity

This is arguably the most obvious consideration. Every vulnerability is classified in the Common Vulnerabilities and Exposures (CVE) list and is given a score under the Common Vulnerability Scoring System (CVSS) that expresses its severity. Generally, the higher the severity, the higher the priority to fix the vulnerability. However, that is not always the case. For instance, a CVSS score may take some time to be assigned to a new vulnerability, so zero-day vulnerabilities may slip below the radar, so to speak. Also, a vulnerability only poses risks if it’s associated with a component or dependency that you use in your code. If not, then it doesn’t threaten your code base.

Example CVSS Scores:
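
For reference, CVSS v3 groups scores into severity bands: 0.1–3.9 is Low, 4.0–6.9 is Medium, 7.0–8.9 is High, and 9.0–10.0 is Critical.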

2. Exploitability

Some vulnerabilities are easily used, or exploited in attacks, making them likely to be used by threat actors. A vulnerability with a potentially severe impact can have low exploitability, while a less severe vulnerability might be easily and frequently exploitable. In this case, the less severe vulnerability may pose a higher risk of breach, and it would be prudent to prioritize it.

3. Reachability

Vulnerabilities are only exploitable if they’re reachable. In other words, when attackers can find a clear path from the code to the vulnerability. If you’re calling the vulnerable code, then the vulnerability is potentially exploitable. When there’s no path, there are no direct calls from the code to the vulnerability. Some vulnerabilities are found in code that’s not executed by your software or application, so they’re not reachable from your code. It would be a waste of time and resources to target these. Better to prioritize those that are reachable and therefore more easily exploitable.

4. Business risk

Another important question to ask is, "What business risk does your software or application hold?" This consideration primarily revolves around data, particularly financial and personally identifiable data for customers. Information of this kind is valuable to malicious actors and will be a target for their attacks. So, vulnerabilities in software and applications that handle such data are prime candidates for prioritization.

5. Application usage

Similarly, you should consider how the software or application you’ve developed will be used. Is it a marginal or critical application? Will it be used frequently and by a large number and variety of people? Is it connected to the network, which is generally an attractive access point for attackers, and is it used in production? Is it for internal use only, or is it also used live by third parties, such as customers and partners? It stands to reason that heavily used critical applications, open to other users and organizations, and used in production, are likely to be more vulnerable and will expose you to more risk, so you may wish to prioritize them.

6. Fix availability

A further prioritization parameter is whether a fix is available. This may sound rudimentary, but there’s little point in prioritizing a vulnerability — bringing it to the front of the queue, so to speak — if there’s no fix available. At that moment in time, there’s nothing you can do to remediate the situation until a fix has been released.

It’s all about the context

All of these considerations should be weighed contextually when prioritizing vulnerabilities. Implementing context-based prioritization works because what might be a serious vulnerability in one use case can have less or no impact in different use cases. For instance, you may encounter a severe CVE that makes any Windows machine running a particular application vulnerable to remote code execution, allowing attackers to take over the application. However, you’re on Linux, so although the vulnerability is potentially hugely impactful for millions of Windows users, it has no detrimental effect on you because you’re not using that operating system. Its threat goes from high to zero instantly, depending on this different context.

On the other hand, a medium severity vulnerability that enables people to attack you on Linux machines should take higher priority because damage can be done to you, even though its severity is lower. 

Applying a prioritization funnel

Your application security solution should have the capability to funnel and evaluate vulnerabilities through each of these considerations.

With a prioritization funnel, thousands of undifferentiated and unfiltered vulnerabilities enter the top of the funnel. As they move through the numerous layers of consideration criteria, many are filtered out, such as those that are:

  • Mild in severity
  • Unexploitable or low in exploitability
  • Unreachable
  • Lightly or never used in your code base
  • Posing low or no risk to sensitive data
  • Present only in internal applications, bearing no threat to third parties
  • Ineffective in your operating environment
  • Lacking an available fix

Now, all of these may indeed be vulnerabilities, but the salient point is that their threat to your particular code, software, and applications is negligible. From your perspective, they are false positives and they can be disregarded at best, or at least deprioritized in favor of those vulnerabilities that made it through to the bottom of the funnel. These are your true positives, those vulnerabilities that still pose a real, immediate, and serious threat. These are the vulnerabilities you should prioritize. Importantly, by applying this process, you end up with a clear sense of top priority vulnerabilities that should be immediately addressed.

Prioritization Funnel

The ideal prioritization tool

Your ideal prioritization tool will automatically perform this funneling process to your usage specifications. Most importantly, it will not rely solely on severity metrics, and it will apply a broader context to vulnerabilities in order to streamline and accelerate your security and remediation processes.

For example, Mend Prioritize applies priority scoring based on the range of considerations that you specify. The priority score enables you to make informed decisions and implement automated risk-based policies so that the biggest overall threats to your business are remediated first.

It operates with Effective Usage Analysis technology that scans open source components with known vulnerabilities to assess whether your proprietary code is making calls to the vulnerable component. And Mend Prioritize has a market-leading reachability engine that identifies not just when vulnerabilities are exploitable but can specify whether they’re reachable.

Mend goes beyond simply warning “No” with a red shield, like other tools. We highlight usable components and dependencies with a green shield in our UI. We tell you which are not reachable and which are therefore safe to use. Our research shows that only 15 percent to 30 percent of vulnerabilities are indeed effective, so you can drastically reduce the number of vulnerabilities you need to prioritize and target those that matter most. 

]]>
The Risks and Benefits of Updating Dependencies https://www.mend.io/blog/the-risks-and-benefits-of-updating-dependencies/ Wed, 14 Dec 2022 13:59:12 +0000 https://mend.io/the-risks-and-benefits-of-updating-dependencies/ One of the most important steps of securing your code base, your software, and your applications, is to update the dependencies they rely on. In principle, maintaining software health with updates demands that you use recent versions of any software and dependencies. Recent updates are less likely to be exploited and attacked via publicly known vulnerabilities than older versions, because with the latter, malicious actors have had more time to hunt for weaknesses. However, updating dependencies does present some risks as well as benefits. In this blog, let’s look at both, and what can be done to optimize your software and application security.

Main risks

There are two primary risks when you’re updating dependencies. The first is more common but less disruptive. The second is much less common but much more catastrophic.

  1. When you update dependencies, you run the risk of breaking something in your code. This problem can arise because you don’t necessarily know if the new version of a dependency is backward compatible with the older version that you’ve been using. If it’s not, you may inadvertently disable your software by installing the new dependency. The fix is to uninstall it and revert to the older dependency or change your code to match the new dependency’s changes.
  2. Second, and more serious, is that the new version has some sort of malicious code inside. It has been deliberately created or manipulated to either disrupt software and applications or gain access to the organization using them. Installing updates of this kind can result in you unknowingly infiltrating your code with malicious content that could gravely undermine it.

What can you do to decrease these risks?

In terms of breaking the build, make sure you have very good test coverage so that you can test every new version and see that each of them isn’t breaking anything. At Mend, what we enable people to do is to use what’s called a crowdsource test with our Merge Confidence feature within our free Mend Renovate tool. This minimizes risk when updating dependencies by identifying undeclared breaking releases based on analysis of test and release adoption data across our user base. It enables us to take into consideration all of the tests our users apply, and we aggregate those and give a percentage of those updates that pass and those that fail. Using a feature like this means that you don’t have to rely on your tests alone. Instead, you can rely on everybody who is using the Mend Renovate application, which covers over 500,000 repos on GitHub alone.

A second tactic for reducing dependency update risks is to aggregate package adoption data. So, for example, if you have 100,000 people who are using Lodash (a popular JavaScript-based library that facilitates web development) and the new version is, say 4.17.5, then we can tell you what percentage of people using Lodash are on that version. So, if you can see that twenty percent of people have already adopted this new version, then that’s a very strong indicator that it’s a healthy version that is unlikely to break anything.

How Merge Confidence works in Mend Renovate

For every dependency update, Merge Confidence opens up a new branch inside the repo. Then, when that branch is opened and a pull request is created to merge it in, it runs all of your tests, and you see if the pipeline is passing or failing your tests and those of other users. Having done this, it aggregates the percentage of pipelines that are passing the tests and serves these results as a metric inside the pull request that someone can review. So, for example, if only fifteen percent of pipelines are passing, that’s obviously a very bad indicator for this update. You can’t be confident about using it, so it isn’t a dependency you would want to take on without some serious manual review, or to be prudent, you would simply avoid it altogether.

On the other hand, if ninety-nine percent are passing, then that shows the update is safe and sound to use. Even if adoption numbers aren’t high, if the dependency is passing everybody’s tests, then the indications are that it’s not breaking anything, the change isn’t problematic and you can use it reasonably without too much manual reviewing. To make an even more informed decision, you can also check out the release notes about the dependency that Merge Confidence displays, so you can see its performance.

How to handle malicious updates

In addition to taking the action I’ve described, extra care is recommended when handling malicious updates. Whenever possible, it’s vital to detect and block malicious open source packages at the earliest opportunity before your developers can download them and before they can pollute your codebase with malicious activity. You can achieve this by deploying a malware scanner for open source packages, such as Mend Supply Chain Defender.

It’s also prudent to wait to take on updates, and don’t do so as soon as they’re released. How long you wait depends on your organization’s policies, and how safe it wants to be. Waiting ten days or twenty days is an acceptable time. By then, it’s likely to be known whether the update or package is secure. And if you can automate the process, all the better.

Why dependency update hygiene is like using a Roomba

I like to think about automating dependency updates like cleaning your home with the Roomba robotic vacuum cleaners. If you have the Roomba running all the time, then it’s going to clean your apartment so that it’s livable or sustainable. However, it’s not going to make it completely clean. To complement this, you’re still going to need to wash your floors once every so often, depending on how big your home is and how thoroughly you want it cleaned.

It’s the same thing with auto-merging or updating dependencies. If you’re auto-merging all the small patches for dependency updates, it’s a great way to stay up-to-date and keep your repo in a sustainable condition. That’s because you’re making it nimble enough so that you can handle an urgent problem like a zero-day vulnerability if one arises. You won’t find yourself wading through many months’ worth of update backlogs before reaching the urgent update. So, I like to think of auto-merging patches or minor dependency updates as happening in the background like cleaning your apartment with the Roomba, until you’re ready to get your mop and bucket out and thoroughly clean, which is the equivalent of reviewing major dependency updates and seeing if there are any new features that you want to include in your project. If you want to take on a major update, you need to solve all the backward incompatibilities. So, if API names have changed, for example, then you would need to manually change the names of the APIs that you’re calling. That remains a manual process.

Benefits of dependency updates

There are three main benefits to updating dependencies:

1.     Vulnerability prevention. We recently examined npm CVEs, and we discovered that, in 2021, over ninety percent of them weren’t in the most recent version of dependencies. So, in principle, if you ensure that you always have all the most recent versions of dependencies, then you’ll automatically prevent ninety percent of the newly disclosed vulnerabilities. Some vulnerabilities have no fix available, but that tends to happen in unmaintained projects rather than in very active ones. So, the biggest benefit is that you’re avoiding vulnerabilities and this, of course, saves developers and the security team a lot of work.

2.   New features. When you update dependencies, you get access to the software’s latest features and the latest APIs, as well as fresh bug fixes to protect your software. So, you’re simultaneously getting revised and updated capabilities while you’re keeping your software as secure as possible against the newest vulnerabilities and threats.

3.  Protection against zero-day vulnerabilities. Maintaining dependency updates means that you’re better prepared to respond to urgent and unexpected security alerts, and you can be confident that your response will be fast, and effective and won’t itself break your code. If you’re regularly updating dependencies then you can simply apply security patches and you can do it immediately. On the other hand, if you don’t update dependencies, or do so only sporadically, then when there’s a sudden breach, it’s much more of a scramble to locate the breach and protect your code against it. In this scenario, it becomes a crisis, requiring urgent triage. Let’s say you haven’t updated your dependencies for a year and then you’re faced with a serious breach like Log4j. Suddenly you have to implement a year’s updates throughout hundreds of your applications, and you need to do it fast, without thorough testing to make sure nothing breaks. And you remain vulnerable while this is underway. The process is much slower and more prone to problems than if you frequently and regularly update your dependencies. Put simply, it’s best practice, which enables you to react quickly, decisively, and unproblematically.

What should you do with dependencies to get the best security?

1.    Know and understand your dependencies. The first thing about keeping dependencies up to date is that you have to know what a dependency is. Developers picture dependencies as open source packages or third-party libraries, but in reality, a dependency is anything that you use in your application that you didn’t create yourself. This includes open source packages, but it can also be Docker images that you’re basing your deployment on. It can also be code you’re using that was written by other teams. It can be Infrastructure as Code that you’re running in your application as a dependency. It can be Kubernetes manifest files and it can also be, of course, the source files, but this is much less common.

2.    Avoid unexpected dependency upgrades by using a lock file. A lock file locks all of your dependency versions in place, including the direct dependencies, such as those in a regular package file, but also the transitive or indirect dependencies. This way, you’re not getting any unexpected upgrades. In the npm ecosystem, for example, the lock file is package-lock.json, and running npm ci installs exactly the versions recorded there.

An alternative to using a lock file is to specify the range of updates that you will use. This means you set your system to use any update to dependencies between two points. For instance, you can specify that you will use any version of a dependency update between versions 1.1 and 2.0. If the most recent version now is 1.1 and version 1.2 comes out, you’ll automatically use that, because you have specified that this version is acceptable; npm will resolve to the most recent version that fits the criteria you define. In package.json terms, pinning looks like "lodash": "4.17.5", while a range such as "lodash": "^4.17.0" allows any 4.x release at or above 4.17.0.

However, there are two problems with this. You could break something accidentally and you might not necessarily even know it because the dependent upgrade was automatic. Or something malicious might lurk in the update and you haven’t had time to review it and fix it.

So, using a lock file is good practice in general, because any time somebody builds the application, they’ll be using the same dependencies that you’re using. Let’s say they clone the repo and install the same dependencies: it will work uniformly. There should be no issue with it working on one machine but not another.

However, pinning like this applies only to the direct dependencies in the package file, not to transitive dependencies. If, for example, a downstream developer uses one hundred other libraries and all of them pin their dependency versions, the developer could end up with ten or fifteen versions of the same dependency, and that’s cumbersome and confusing. It’s bad from an application-size perspective and also in terms of dependency management. It’s a huge burden you don’t want. On the other hand, if you’re writing something that’s not meant to be used downstream, such as a web app, then there are no such problems.

Therefore, it’s definitely best practice to use a lock file, and if you’re writing a web app, the best practice is to pin dependencies. For a library, you’ll probably want to consider using ranges to be more user-friendly.

3.     Use SBOMs. Then, all your components will be visible, and you will know which components will need updating and by when. This will enhance the security and maintainability of your codebase and will help you ensure that your projects are agile. As I mentioned earlier, any poorly maintained project with old dependencies that haven’t been updated in months or years will be behind the curve in terms of updates, which will make it very hard to respond quickly and effectively to sudden breaches that arise, like zero-day vulnerabilities.

4.     Choose good dependencies. When you’re introducing a new dependency, it’s important that it’s healthy. So, you want to see that the last release wasn’t a long time ago, and that the project is still being maintained. You want to see that there’s a decent cadence of commits coming into the repo and that it’s an active project. You want to see that issues and pull requests are being opened, and that there’s activity in the repo, not just commits, but also community activity. And security patches should be up to date to be sure that you have a secure dependency. You want to see that the maintainers care about that, and they’re actively applying security patches.

With this in mind, the necessity of updating dependencies can be illustrated in a simple analogy that we can all appreciate and that offers a compelling reason to do it, regularly and proactively:

“Updating dependencies is like going to the dentist. If you only go once every five years, it’s really going to hurt.”

]]>
Improving Your Zero-Day Readiness in JavaScript https://www.mend.io/blog/improving-your-zero-day-readiness-in-javascript/ Tue, 15 Nov 2022 22:47:05 +0000 https://mend.io/blog/improving-your-zero-day-readiness-in-javascript/ Data breaches are a massive issue. Beyond reputational damage and user data loss, financial costs must also be considered. With the need for extra staff, legal counsel, and even credit-monitoring services for those involved, the Ponemon Institute estimated the global average cost of a data breach in 2020 was $3.86 million. Given that, it’s clear investing in zero-day readiness should be top of mind for security engineers and developers alike.

What does “zero-day” mean?

“Zero-day” is a broad term that refers to an unpatched security flaw unknown to the public. It can also be a recently discovered vulnerability in the application. In either case, the developer has “zero days” to address and to fix it before it can be potentially exploited. Attackers make use of such flaws to intrude and to attack the system. Most times, these flaws are spotted by bounty hunters and are promptly patched. However, sometimes the attackers get there first and exploit the flaw before it is fixed.

In the context of web application security, a zero-day vulnerability is often used in cross-site scripting (XSS) and other types of attacks. Attackers take advantage of these vulnerabilities to inject malicious code into webpages viewed by other users. The code then runs on the user’s browser and can perform various actions, such as stealing sensitive information or redirecting the user to a malicious website (often owned by the attacker).

One of the most notable zero-day attacks was the 2014 attack on Sony Pictures Entertainment.  Sony was the victim of a devastating cyber attack that led to the release of sensitive information, including employee data and financial records. The attackers used a zero-day vulnerability in the company’s network to gain access to its systems, which allowed them to steal large amounts of data. The Sony Pictures hack was a major wake-up call for many organizations, as it showed just how vulnerable they could be to cyber attacks, and how costly the reputational and financial damages were.

To better illustrate zero-day problems, let’s examine some zero-day terminology and demonstrate each term with the example of an SQL injection. This is a type of attack where the culprit uses SQL commands to steal or to manipulate data in SQL databases.

Zero-day vulnerability

A zero-day vulnerability is a security flaw in the software, such as the operating system, application, or browser. The vendor, software developer, or antivirus manufacturer has not yet discovered or patched this software vulnerability. Although the flaw might not be widely known, it could already be known to attackers, who are exploiting it covertly.

In the case of SQL injection, the vulnerability would be the lack of input sanitization. In this instance, the developer has skipped the step of validating the input data and verifying whether it can be stored or if it contains harmful data.

Zero-day exploit

If security is compromised at any stage, attackers can design and implement code to exploit that zero-day vulnerability. The code these attackers use to gain access to the compromised system is referred to as a zero-day exploit.

Attackers can inject malware into a computer or other device, using the exploit code to gain access to the system and bypass the software’s security by leveraging the zero-day vulnerability. Think of it like a burglar entering a home through a damaged or unlocked window.

In terms of SQL injection, the zero-day exploit is the code or manipulated input data the attackers use to infiltrate the vulnerable system.

Zero-day attack

A zero-day attack is when a zero-day exploit is actively used to disrupt or steal data from the vulnerable system. Such attacks are likely to succeed because there are often no patches readily available. This is a severe security drawback.

In SQL injection, the zero-day attack occurs when the exploit code is injected at a vulnerable point in the software (where no input validation was done).

3 best practices for zero-day readiness

1. Always sanitize input data

Input validation is perhaps the most cost-effective way of improving application security. For any vulnerability to be exploitable, the attacker must first be able to bypass certain checks or validation. The absence of input sanitization is like leaving a door unlocked for the attacker to walk right through.

A solid regular expression (regex) can be designed to cover all the edge cases while validating an input.

For instance, if an input accepts a valid American mobile number, the validation can be performed as follows:

const number = /^(0|1|\+1)?\s?\d{10}$/

if(input.match(number)){...}

The above regex considers all the corner cases and also makes it easier to write all the cases as a single expression.

This validation should not be done only on the client side. As a layer of added security, validation needs to be performed on the back end as well. Since the validation on the client side can be manipulated, always recheck for the validation on the server side before performing operations on it.

2. Intercept object behavior

If the attacker manages to bypass the validation, there needs to be some code that can still validate and handle data by default. This can be done using proxy objects.

Proxies are objects in JavaScript that intercept the behavior of other objects. This can be used either to extend or to limit a property’s usability.

To get a clearer insight, consider the code snippet below:

class Vehicle {
  constructor(wheels, seats) {
    this.wheels = wheels
    this.seats = seats
  }
}

class Car extends Vehicle {
  #secret_number
  constructor(wheels, seats, power) {
    super(wheels, seats);
    this.wheels = wheels
    this.seats = seats
    this.power = power
    this.#secret_number = Math.random() * Math.random()
    // Assume the secret number to be 0.052
  }

  getSecretNumber() {
    const self = this
    const proxy = new Proxy(this, {
      get(target, prop, handler) {
        if (prop === 'secret_number') {
          return self.#secret_number * 500 // returns 26
        } else {
          return target[prop]
        }
      }
    })
    return proxy['secret_number']
  }
}

const car_1 = new Car(4, 5, 450)
car_1.getSecretNumber()

In the above snippet, the class "Car" inherits from the class "Vehicle."

The class "Car" contains a private field, "secret_number," whose value is generated when an instance of the class is created, along with a getter method, "getSecretNumber." Declaring the field private avoids direct access, and inside the getter its behavior (or value) is intercepted using a proxy.

In the example above, there is no need to use a proxy. However, what if there were a single getter function to return the desired value by passing an argument? In such cases, proxies would be of great benefit by masking or intercepting only the selected properties.

3. Maintain recent dependencies

Using external libraries in any JavaScript-based project is quite common because redeveloping an already created utility is a waste of time, energy, and resources. Additionally, many open source libraries have continuous support and user-contributed updates. While such dependencies provide numerous advantages, most developers find it quite painful to keep track of and to update them. 

Security is one of the main reasons to be vigilant about dependency updates; automating the application of security patches can fix over 90% of new vulnerabilities before any issue is publicly disclosed. 

Because many developers find tracking and updating dependencies so difficult, they automate the task using a tool like Mend Renovate. It is an open-source tool for developers that automatically creates pull requests (PRs) for dependency updates.

Renovate first scans the entire repository and detects all the dependencies used. It then proceeds to check if there are any updates available on those dependencies. If so, it raises a pull request containing relevant information, such as release notes, adoption data, crowdsourced test status, and commit histories. Insights into such data further help in deciding if an update needs to be merged or not.

Automating dependency updates could save up to 20% of your development time.

To get started, do the following: 

  • Navigate to GitHub marketplace (linked above), and install the Renovate app.
  • Select all the repositories you would like Renovate to scan.
  • Before it starts scanning for updates, an onboarding PR is raised. This contains a preview of the actions Renovate will take. Merge it.
  • Leave the rest to Renovate.

Updating dependencies with Renovate minimizes the risk of breaking code during an update because the merge confidence is crowdsourced. This can be used to evaluate whether an update can be safely merged or if it contains potential risk.

In the case of zero-day vulnerability, rather than dealing with accumulated tech debt and deciding whether to jump a few major releases, you’ll only need to apply a security patch by updating to the next invulnerable version.

Updating dependencies is a crucial step in preventing security issues, and apps like Renovate can simplify the process and save developers time.

Conclusion

For any application, protecting its users’ data and other confidential information is of prime importance. While no system is 100 percent foolproof, vendors and software developers must always seek to find and to fix vulnerabilities before they’re exploited.

At the very least, there needs to be an incident response plan ready to minimize the impact if an attack does occur.

As the saying goes, however, an ounce of prevention is worth a pound of cure. For the best results, make sure you have implemented enough preventative measures to minimize the chance of exploitation in the first place.

]]>
Typosquatting Malware Found in Composer Repository https://www.mend.io/blog/typosquatting-malware-found-in-composer-repository/ Mon, 22 Aug 2022 09:58:56 +0000 https://mend.io/blog/typosquatting-malware-found-in-composer-repository/ If you’re familiar with modern PHP development techniques and practices, you must certainly know and enjoy the benefits of composer when it comes to bringing in third-party libraries.

With PHP being such a popular language on the web and composer so widely used, it is no surprise that attackers are doing their best to jeopardize it and take advantage of vulnerable applications out there.

Not too long ago there was a successful attempt against composer’s core: packagist.org.

And recently another type of attack was detected, in this case the problem was a typosquatting malware that got into the public repository.

This type of attack exploits the fact that people type fast and don’t usually review what the end result looks like or, if they do, they just take a quick look and move on.

For instance, if you wanted to get some content from a remote website from your terminal you could issue a command such as:

curl

But what if instead of curl you typed cyrl? Like this:

cyrl

In most cases this wouldn’t have any further consequences than getting a message like:

zsh: command not found: cyrl

So you’d realize your mistake, fix the typo and go on your way. But what if there actually was a cyrl command available in your computer? It would simply do whatever it was meant to do… the problem would be that, if it produced a somewhat similar output to what you expect from curl, it’d be very hard for you to notice the fact that you just ran something other than the command you meant to run.

Of course, this scenario is utterly extreme, since someone would have to have access to your personal computer and install such a binary… very unlikely.

In this case, the attack vector was a handcrafted package name uploaded into packagist.org (The main repository for composer packages).

Since packagist.org is a public repository, there’s not much control over what gets published, as long as it follows a few simple rules.

What the attacker did was look for a very popular package (Download stats are very easy to come by), create a malicious one with a very similar name and put it up for download right next to the original.

In this case, the fake package was called symfont/process (mimicking symfony/process).

You can see how the usage of such a similar name allows for many developers to download the malware instead of the actual package they were looking for by simply mistyping (Note how the t is right next to the y on your keyboard).

Once the package is downloaded it will provide its own implementation of Symfony\Process\Process which is not very much like the one designed by the Symfony team.

What this Process class does is send information about the host it’s running on back to a central location, to be used in further attacks.

The attack is completed by opening a webshell on the server infected with the malware, which can be used by the attacker to execute arbitrary commands on the victim’s machine.

Of course the attack can only be successful if:

  1. The victim’s application makes use of the Symfony\Process class
  2. The part of the application that makes use of the  Symfony\Process class is actually executed

It is safe to assume both conditions are met since the fact that the package is present as a project dependency means that there’s a need for it (or there was a need for it in the past and nobody removed it from the composer.json file yet).

Still, in a case like this, there’s a small window of opportunity to avoid being a victim of such a threat between the time of downloading/installing the malware and actually allowing it to run its malicious code.

For more details you can read the original report by Sean Murphy (The researcher who discovered the problem) here.

According to the report, the attacker has been identified and the threat neutralized (if you search for the package symfont/process on the packagist site there’s no reference to it).

It’s worth mentioning a similar situation presented itself back in 2016 with the Monolog package as commented in this post by Jordi Boggiano.

In this case, the attack attempt was perpetrated via uploading a package named momolog/monolog instead of the popular monolog/monolog (A widely used package for error log handling).

It’s very interesting what Jordi did to encounter this suspicious package. He put together a script that would get the names of the vendors of the most downloaded packages from packagist.org and compare each name to the others using the levenshtein function, which calculates the distance between two strings, effectively giving an idea of the similarity between the two.

For instance, running the following code:

levenshtein('monolog', 'momolog');

Will produce a 1 as a result, while this:

levenshtein('monolog', 'monologer');

Will produce a 2.

A levenshtein distance of one means that just by replacing one character for another (or adding or removing one) on a string you get the second.

This doesn’t necessarily mean that the similar vendor name is a bad actor, though it should definitely raise a flag.

After getting a short list of potential attackers it’s easy to go through it manually looking for potential typosquatting attempts. It’s basically trying to spot those names that could be produced by hitting a key that’s right next to the correct one.
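
Jordi’s script was written in PHP, but the same check is easy to sketch in other languages. Here’s a rough Python version of the idea, using a standard dynamic-programming Levenshtein implementation (the vendor names below are purely illustrative; in practice you’d pull the most downloaded vendors from packagist.org):

def levenshtein(a, b):
    # Classic dynamic-programming edit distance
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

# Illustrative vendor names only
vendors = ["symfony", "symfont", "monolog", "momolog", "laravel"]

for i, first in enumerate(vendors):
    for second in vendors[i + 1:]:
        if levenshtein(first, second) == 1:
            print(f"Suspiciously similar: {first} / {second}")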

Finally, it seems like the automation that was proposed by Jordi never got developed or deployed; otherwise, the symfont attack wouldn’t have been possible.

Typosquatting vs. Dependency confusion

There’s another somewhat similar type of attack known as dependency confusion. In the case of this attack, the target is not the individual developer but the package manager itself.

It’s a common practice for big companies to have their own private packagist where they store libraries that are only meant to be used internally.

Modern package managers have the intelligence to deal with version calculations to determine when a particular library is outdated. They do so by comparing the locally installed version numbers to the published ones.

This attack is performed by uploading to a public repository a package with the exact same name as the private one but with a higher version number.

This way, when someone asks their package manager to install the aforementioned library, they’ll inadvertently be downloading the malicious version.

In the case of composer, or better yet, packagist, there are a few measures in place to avoid becoming victims of such attacks. Among those you’ll find that:

  • Composer package names are of the form vendor/package. The vendor name is reserved for the first entity to upload a package. This means no attacker can hijack your company name and thus publicly upload malicious packages on your behalf… unless they beat you to upload the first package.
  • Up from version 2.0 of Composer, custom repositories take precedence over public ones.

You can read more details about Composer’s Dependency Confusion prevention measures here and, if you want to go deeper into understanding the nature of this threat and how it was discovered here’s the article by Alex Birsan, the researcher who first reported its existence.

Conclusion

While this particular attack is no longer a concern, this kind of vulnerability could easily be replicated using another popular package as a vector.

So, it’s probably a good idea to look twice at the composer require you’re issuing before hitting enter.

Or better yet, rely on automatic tools such as Mend Renovate to help you stay on the safe side.

]]>
Cloud-Native Applications and Managing Their Dependencies https://www.mend.io/blog/cloud-native-applications-and-managing-their-dependencies/ Thu, 04 Aug 2022 05:17:31 +0000 https://mend.io/blog/cloud-native-applications-and-managing-their-dependencies/ Many businesses are transitioning their operations to the cloud because it enables them to be more agile, respond faster to market changes, and scale their businesses more easily. However, moving to the cloud can also be a daunting task because it requires significant planning and execution. The term “cloud-native” refers to a methodology for developing and operating applications that makes use of the benefits offered by the cloud computing delivery paradigm.

What does it mean to be “cloud-native”?

Where applications are hosted is not important. What is important is how they are developed and deployed. Cloud-native development can be used with either public or private clouds. Cloud platforms can store data and make it accessible through public gateways around the world, regardless of where that data is physically located, and they provide on-demand access to processing power as well as up-to-date data and application services for software developers. Cloud-native applications are developed using microservices, which are small, independent services that work together to form a larger application.

Microservices are a type of software architecture that allows developers to build and maintain large applications more easily. In a microservices architecture, an application is divided into small, independent parts called services. Each service can be developed and deployed independently, making the overall process more flexible and scalable.

Microservices can benefit the cloud infrastructure in several ways. First, they allow for more granular control over individual parts of an application. This makes it easier to deploy and manage applications in the cloud. Secondly, microservices make it easier to scale applications. When an application needs to be scaled up or down, only the services that need to be changed need to be updated. Finally, microservices can improve availability by allowing for rolling updates and deployments. If one service goes down, the others can continue to run, making the overall system more resilient.

When a data request is made, it is routed through a number of distinct Docker containers, each of which is running a separate set of microservices to provide service to the consumers. They are created with the intention of delivering commercial value that is widely acknowledged, such as the capacity to rapidly incorporate user feedback to achieve continuous improvement. Each container is responsible for the operation of a single service that is directed toward serving the customers. These containers are able to offer users scalability and the appropriate level of protection.

How do dependencies fit in? 

A dependency is an implicit or explicit relationship between one piece of code and another. It can be thought of as a requirement that one piece of code has for another piece of code in order to function correctly.

There are two main types of dependencies: hard dependencies and soft dependencies. Hard dependencies are dependencies that cannot be changed without breaking the code that depends on them. Soft dependencies, on the other hand, can be changed without breaking the code that depends on them.

Dependencies can be either internal or external. Internal dependencies are dependencies between two pieces of code within the same software system. External dependencies are dependencies between two pieces of code that reside in different software systems.

In a cloud-native application, each microservice has its own dependencies. These dependencies are managed by the container in which the microservice is running. The container is responsible for ensuring that the correct versions of the dependencies are used and that they are kept up-to-date.

Since the development of those features from scratch would need a significant amount of time and because of the complexity of their design, it is much more efficient to use existing solutions. Because so many dependencies are required, solutions to manage them are also required and hence we have package managers such as Maven or NPM. 

NPM, for example, calls for a wide variety of dependencies to be loaded into the container before it can be deployed. Due to the fact that many dependencies are open source, a variety of researchers have access to and uncover vulnerabilities in them, which is one of the many reasons why they receive updates.

Dependencies are of large concern to developers because, if neglected, they can become a security issue. If a developer is not careful, they can easily introduce vulnerabilities into their codebase by depending on code that has known vulnerabilities. This is why it’s important to scan third-party dependencies before installing them and apply security patches when they’re available. 

For example, if we are talking about NodeJS, it typically gets updated once a month, and each of those updates fixes a couple of vulnerabilities. Thus, it is essential to update those systems on a regular basis to ensure that we can avoid as many dependency-related vulnerabilities as possible.

Best practices for dependency management

When we talk about dependency management, we talk about a lot of different strategies and things that are being considered, such as using an automated dependency management tool or a package manager. Nonetheless, to ensure that the dependencies are effectively managed, the following are a few of the best practices that can be utilized.

Detecting all the unused dependencies

You can use depcheck to check whether there are any dependencies that are not being used. The following command installs depcheck.

npm install depcheck -g

Once installed, you can run the following command to check for unused dependencies.

depcheck

Detecting all the outdated dependencies

Most dependencies are open source and usually get updated once in a while, as and when security researchers find vulnerabilities or new functionality is added. Hence, it is possible that your dependencies get outdated. Thus, it is essential to verify and update outdated dependencies.

To check for outdated dependencies, open a terminal in your project folder and run the following command:
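
npm outdated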

You can also use a simple dependency-checking script. It scans a repo’s source files and reports which declared dependencies are actually used:

#!/bin/bash
DIRNAME=${1:-.}
cd $DIRNAME
FILES=$(mktemp)
PACKAGES=$(mktemp)
find . \
    -path ./node_modules -prune -or \
    -path ./build -prune -or \
    \( -name "*.ts" -or -name "*.js" -or -name "*.json" \) -print > $FILES
function check {
    cat package.json \
        | jq "{} + .$1 | keys" \
        | sed -n 's/.*"\(.*\)".*/\1/p' > $PACKAGES
    echo "--------------------------"
    echo "Checking $1..."
    while read PACKAGE
    do
        RES=$(cat $FILES | xargs -I {} egrep -i "(import|require).*['\"]$PACKAGE[\"']" '{}' | wc -l)
        if [ $RES = 0 ]
        then
            echo -e "UNUSED\t\t $PACKAGE"
        else
            echo -e "USED ($RES)\t $PACKAGE"
        fi
    done < $PACKAGES
}
check "dependencies"
check "devDependencies"
check "peerDependencies"

Keeping desired dependencies updated

Due to the wide variety of dependencies being utilized, it is necessary to keep the desired dependencies consistently up to date for the best performance. Checking and upgrading those dependencies manually typically takes a significant amount of time. Thus, a wide variety of organizations utilize automated dependency management tools to ensure their dependencies are kept up to date on a consistent basis and in a timely manner. The dependencies of an npm application are defined in the repo’s package.json file, which looks something like this:

{
  "name": "herodevs-packages",
  "version": "0.0.0",
  "scripts": {
    "ng": "ng",
    "precommit": "lint-staged",
    "start": "ng serve",
    "build": "ng build",
    "test": "ng test",
    "lint": "ng lint",
    "e2e": "ng e2e",
    "build-lazy": "ng build lazy",
    "build-dynamic": "ng build dynamicService",
    "npm-pack-lazy": "cd dist/loader && npm pack",
    "npm-pack-dynamic": "cd dist/dynamic && npm pack",
    "package-lazy": "npm run build-lazy && npm run npm-pack-lazy",
    "package-dynamic": "npm run build-dynamic && npm run npm-pack-dynamic",
    "package": "rm -rf dist/ && npm run package-dynamic && npm run package-lazy"
  },
  "private": false,
  "dependencies": {
    "@angular/animations": "^8.0.0",
    "@angular/common": "^8.0.0",
    "@angular/compiler": "^8.0.0",
    "@angular/core": "^8.0.0",
    "@angular/forms": "^8.0.0",
    "@angular/platform-browser": "^8.0.0",
    "@angular/platform-browser-dynamic": "^8.0.0",
    "@angular/router": "^8.0.0",
    "core-js": "^2.5.4",
    "rxjs": "~6.5.2",
    "zone.js": "~0.9.1"
  },
  "devDependencies": {
    "@angular-devkit/build-angular": "~0.800.0",
    "@angular-devkit/build-ng-packagr": "~0.800.0",
    "@angular/cli": "~8.0.2",
    "@angular/compiler-cli": "^8.0.0",
    "@angular/language-service": "^8.0.0",
    "@types/jasmine": "~2.8.8",
    "@types/jasminewd2": "~2.0.3",
    "@types/node": "~8.9.4",
    "codelyzer": "^5.0.1",
    "husky": "1.3.1",
    "jasmine-core": "~2.99.1",
    "jasmine-spec-reporter": "~4.2.1",
    "karma": "~3.0.0",
    "karma-chrome-launcher": "~2.2.0",
    "karma-coverage-istanbul-reporter": "~2.0.1",
    "karma-jasmine": "~1.1.2",
    "karma-jasmine-html-reporter": "^0.2.2",
    "lint-staged": "8.1.0",
    "ng-packagr": "^5.1.0",
    "prettier": "1.16.1",
    "protractor": "~5.4.0",
    "ts-node": "~7.0.0",
    "tsickle": "^0.35.0",
    "tslib": "^1.9.0",
    "tslint": "~5.11.0",
    "typescript": "~3.4.5"
  },
  "lint-staged": {
    "*.{ts,tsx}": [
      "prettier --parser typescript --writeprettier --parser typescript --write",
      "git add"
    ]
  }
}

Source

Using an automated dependency management tool

Automating dependency management can be helpful in a few ways. Not only can it speed up your development process, but it can also ensure that everyone on your team is using the same versions of dependencies. Automation tools work by looking at the dependencies you have declared in your code and comparing them to the versions that are available. If there is a newer version available, the tool will update your project to use it.

The changelog associated with the dependency is typically included in the pull request. You have a lot of different options to choose from when configuring the dependency management tool, such as the update time, which dependency has to be updated, what conditions need to be met if the pull request needs to be merged automatically, and many other things. 

One example is Mend Renovate, which is an open-source tool that automatically creates pull requests for all types of dependency updates. Renovate is different from other dependency update tools because it is completely configurable and can be set up to automatically update dependencies on a regular basis, or only when there are new security updates. It offers features such as fully automated pull request creation and merging, dependency selection based on package popularity and testing data, support for multiple package managers including npm, Yarn, and Composer, and customizable update rules for each repository.

Conclusion

In a cloud-native world, a typical environment is supported by a wide range of dependencies. Thoroughly testing these dependencies is critical to the success of any cloud-native application. However, it can be difficult and time-consuming to manually update all the dependencies. Automated dependency management tools can help to reduce the amount of time spent on managing dependencies and can also improve the quality of your code. In this post, we covered the best practices for managing dependencies in cloud-native applications.

]]>