Language Resources

DZone's Featured Languages Resources

Embracing Asynchrony in Java, Python, JavaScript, and Go

By Andrei Tetka

As a software developer with years of experience working primarily with Java, I was intrigued when I recently switched to Python for a new project. The transition prompted me to explore asynchronous programming in several languages: Java, Python, JavaScript, and Golang. This article is the result of that exploration and of my personal experience with these languages; it aims to provide insight into asynchronous programming techniques, with examples.

Asynchronous Programming in Java

When I first started programming in Java, I quickly became familiar with the concept of threads. Over time, I found that the Executor framework and the CompletableFuture class offered more powerful and flexible ways to handle asynchronous operations.

For example, I used the Executor framework to build a web scraper that fetched data from multiple websites concurrently. By using a fixed thread pool, I was able to limit the number of simultaneous connections while efficiently managing resources:

Java

    ExecutorService executor = Executors.newFixedThreadPool(10);
    for (String url : urls) {
        executor.submit(() -> {
            // Fetch data from the URL and process it
        });
    }
    executor.shutdown();
    executor.awaitTermination(Long.MAX_VALUE, TimeUnit.MILLISECONDS);

Asynchronous Programming in Python

Switching to Python, I was initially challenged by its different approach to asynchronous programming. However, after learning about the asyncio library and the async/await syntax, I found it to be a powerful and elegant solution.

I once implemented a Python-based microservice that needed to make multiple API calls. By leveraging asyncio and async/await, I was able to execute these calls concurrently and significantly reduce the overall response time:

Python

    import aiohttp
    import asyncio

    async def fetch(url):
        async with aiohttp.ClientSession() as session:
            async with session.get(url) as response:
                return await response.text()

    async def main():
        urls = [...]  # List of URLs
        tasks = [fetch(url) for url in urls]
        responses = await asyncio.gather(*tasks)

    asyncio.run(main())

Asynchronous Programming in JavaScript

When working with JavaScript, I appreciated its innate support for asynchronous programming. As a result, I have used callbacks, promises, and async/await extensively in various web applications.

For example, I once built a Node.js application that required data from multiple RESTful APIs. By using promises and async/await, I was able to simplify the code and handle errors more gracefully:

JavaScript

    const axios = require("axios");

    async function fetchData(urls) {
        const promises = urls.map(url => axios.get(url));
        const results = await Promise.all(promises);
        // Process the results
    }

    const urls = [...] // List of URLs
    fetchData(urls);

Asynchronous Programming in Golang

During my exploration of Golang, I was fascinated by its native support for concurrency and asynchronous programming, thanks to goroutines and channels.

For example, while working on a project that required real-time processing of data from multiple sources, I used goroutines and channels to manage resources effectively and synchronize the flow of data:

Go

    package main

    import (
        "fmt"
        "io"
        "net/http"
    )

    func processSource(url string, ch chan<- string) {
        resp, err := http.Get(url)
        if err != nil {
            ch <- fmt.Sprintf("Error fetching data from %s: %v", url, err)
            return
        }
        defer resp.Body.Close()

        body, err := io.ReadAll(resp.Body)
        if err != nil {
            ch <- fmt.Sprintf("Error reading data from %s: %v", url, err)
            return
        }
        _ = body // Process the fetched data
        ch <- fmt.Sprintf("Processed data from %s", url)
    }

    func main() {
        sources := []string{ /* ... */ } // List of data sources
        ch := make(chan string, len(sources))
        for _, url := range sources {
            go processSource(url, ch)
        }
        for range sources {
            fmt.Println(<-ch)
        }
    }

Conclusion

Asynchronous programming is a crucial aspect of modern application development, and a solid understanding of how it is implemented across languages is invaluable. My experiences with Java, Python, JavaScript, and Golang have taught me that each language has its own powerful features for managing asynchronous tasks. By sharing these experiences and examples, I aim to encourage others to embrace asynchrony in their projects, ultimately leading to more efficient and responsive applications.
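The JavaScript example above relies on Promise.all, which rejects as soon as any single request fails. As a hedged sketch of what handling errors more gracefully could look like, the variant below uses Promise.allSettled so that one failing URL does not discard the other results. The names partitionResults and fetchAllSettled, and the injected fetcher parameter, are assumptions made for illustration and do not come from the original article.

```javascript
// Illustrative sketch: partitionResults, fetchAllSettled, and `fetcher`
// are hypothetical names, not part of the original article.

// Split the output of Promise.allSettled into successful values and failure reasons.
function partitionResults(settled) {
  const values = [];
  const errors = [];
  for (const result of settled) {
    if (result.status === "fulfilled") {
      values.push(result.value);
    } else {
      errors.push(result.reason);
    }
  }
  return { values, errors };
}

// Start every request at once and wait until all of them have either
// succeeded or failed. `fetcher` is any function that returns a promise,
// for example url => axios.get(url).
async function fetchAllSettled(urls, fetcher) {
  const settled = await Promise.allSettled(urls.map(url => fetcher(url)));
  return partitionResults(settled);
}
```

A caller could then write `const { values, errors } = await fetchAllSettled(urls, url => axios.get(url));` and decide separately what to do with the failures.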

My First Firefox Extension

By Nicolas Fränkel CORE

A couple of weeks ago, I spent the weekend creating another CFP submission helper in the form of a Firefox extension. It was not a walk in the park. To help others who may be interested in doing the same (and my future self), here's my journey.

Context

I've written multiple posts about my conference submission workflow. To sum up:

- Everything is based on a Trello board.
- I created an app that registered a webhook on the board.
- When I move a conference from one lane to another, it starts or continues a workflow on the app side.

I source the board by looking at websites, mainly Papercall and Sessionize, and manually copying conference data on cards. Two automation options are available:

- Automating conference sourcing
- Automating card creation

I thought long and hard about the first part. If I automate it, it will create a long list of Trello cards, which I'll need to filter anyway. I concluded that it's better to filter them beforehand.

However, I was creating each card manually by copy-pasting the relevant data: name, dates, due date, CFP link, and website. It's precisely what a Firefox extension can help with.

Requirements and Design

The user story is pretty straightforward:

AS A: Lazy developer
I WANT TO: Automatically add CFP data on Trello while browsing a web page on Papercall or Sessionize
SO AS: To spend my time doing more fun stuff than copy-paste
- My single user story

My only requirement is that it needs to work with Firefox.

My first idea was a button to trigger the creation, but I didn't care much where it is: inside the page as an overlay or somewhere on the browser. In the first case, it should be JavaScript injected client-side; in the second, a Firefox extension. I chose the second option because I didn't know how to achieve the first.

I also initially wanted to create my extension in Rust with WebAssembly. Spoiler: I didn't.

A Simple Firefox Extension

I had no clue about writing a Firefox extension, as this was the first time I had written one.
My first step was to follow the tutorial. It explains the basics of an extension structure. Then, I followed the second tutorial. It explains how to create a pop-up menu for the extension but not how to interact with the web page. At this point, I decided to learn by doing, a technique that works well for me.

A Firefox extension starts with a manifest. Here's the one from the first tutorial, simplified:

JSON

    {
      "manifest_version": 2,
      "name": "Borderify",
      "version": "1.0",
      "content_scripts": [
        {
          "js": ["borderify.js"]
        }
      ]
    }

JavaScript

    document.body.style.border = '5px solid red';

I found the development feedback loop good. Imagine that you have followed the tutorial and created all the necessary files above. You can go to Firefox's add-on debugging page and click on the "Load Temporary Add-on" button. Then, point to your manifest file. Firefox loads the extension: it's now active. In the above example, the JavaScript from the tutorial adds a red border around every web page. It's useless, and we can do better, but it shows how it works. We can change the script to change the color, e.g., from red to green. To make Firefox reload any change, including changes to the manifest, click on the "Reload" button on the temporary extension panel.

Interacting With the Extension

As I mentioned above, I want a button to trigger the creation of the Trello card. Firefox allows multiple interaction options: a direct trigger or the opening of a pop-up window. I don't need to enter any parameter, so the former is enough in my case.

Firefox allows multiple places to add buttons: the browser's toolbar, a sidebar, or inside the browser's URL bar. I used the toolbar for no reason other than that it was what the second tutorial demoed. Ultimately, the choice changes little, and moving from one to another is easy.

Adding the button takes place in the manifest:

JSON

    "browser_action": {
      "default_area": "navbar",                  #1
      "default_icon": "icons/trello-tile.svg"    #2
    }

1. Set the button on the navigation bar. For more details on the button location, please check the documentation.
2. Configure the icon. One can use bitmaps in different formats, but it's so much easier to set an SVG.

At this point, everything was fine and dandy. Afterward, I lost many hours trying to understand the different kinds of scripts and how they interact. I'll make it a dedicated section.

Scripts, Scripts Everywhere

The default language for scripts in extensions is JavaScript. However, depending on their location, they play different roles. Worse, they need to "talk" with one another.

Let's start with the content script I used in the above manifest.json. Content scripts are bound to a web page. As such, they can access its DOM. They run when Firefox loads the page. In the tutorial, the script adds a red border around the web page's body.

However, we need another kind of script: one that triggers when we click on the button. Such scripts run along with the extension rather than with a page, and they can listen to events. They are known as background scripts.

"Background scripts are the place to put code that needs to maintain long-term state, or perform long-term operations, independently of the lifetime of any particular web pages or browser windows. Background scripts are loaded as soon as the extension is loaded and stay loaded until the extension is disabled or uninstalled, unless persistent is specified as false. You can use any of the WebExtension APIs in the script, as long as you have requested the necessary permissions."
- background scripts

Let's create such a script. It starts with the manifest, as usual:

JSON

    "background": {
      "scripts": [ "background.js" ]
    }

We can now create the script itself:

JavaScript

    function foo() {
      console.log('Hello from background')
    }

    browser.browserAction.onClicked.addListener(foo)    //1

1. Register the foo function as an event listener to the button. When one clicks the extension button, it calls the foo function.

Debugging the Extension

Let's stop for a moment to talk about debugging.
I lost several hours because I didn't know what was happening. When I started to develop JavaScript 20 years ago, we "debugged" with alert(). It was not the best developer experience you could hope for. More modern practices include logging and debugging. Spoiler: I didn't manage to get debugging working, so I'll focus on logging.

First things first, content scripts work in the context of the page. Hence, logging statements work in the regular console. Background scripts work in another context. To watch their log statements, we need another Firefox developer console. You can open it on the extension panel by clicking the "Inspect" button.

Communication Across Scripts

Now that we know how to log, it's possible to go further and describe communication across scripts. The overall flow is as follows: when the button is clicked, the background script sends a message to the content script of the current tab, and the content script sends a response back.

Let's change the code a bit so that background.js sends a message:

JavaScript

    function sendMessage(tab) {
      browser.tabs
        .sendMessage(tab.id, 'message in from background')
        .then(response => {
          console.log(response)
        })
        .catch(error => {
          console.error(`Error: ${error}`)
        })
    }

    browser.browserAction.onClicked.addListener(sendMessage)

Now, we change the code of content.js:

JavaScript

    browser.runtime.onMessage.addListener((message, sender) => {
      return Promise.resolve('message back from content')
    });

Getting the Content

So far, we have implemented a back-and-forth flow between the background and the content scripts. The meat is to get content from the page in the content script and pass it back to the background via a message. Remember that only the content script can access the page! The code itself uses the Document API, e.g., document.querySelector(), document.getElementsByClassName(), etc. Specifics are unimportant.

The next issue is that the page structures of Sessionize and Papercall are different. Hence, we need different scraping code for each site. We could develop a single script that checks the URL, but the extension mechanism can take care of it for us.
Let's change the manifest:

JSON

    "content_scripts": [{
      "matches": [ "https://sessionize.com/*" ],      #1
      "js": [                                          #2
        "content/common.js",                           #4
        "content/sessionize.js"
      ]
    }, {
      "matches": [ "https://www.papercall.io/*" ],    #1
      "js": [                                          #3
        "content/common.js",                           #4
        "content/papercall.js"
      ]
    }]

1. Match different sites
2. Scripts for Sessionize
3. Scripts for Papercall
4. Code shared on both sites

At this point, we managed to get the necessary data and send it back to the background script. The last step is to call Trello with the data.

Handling Authentication Credentials

Using Trello's REST API requires authentication credentials. I want to share the code on GitHub, so I cannot hard-code credentials: I need configuration.

We can configure a Firefox extension via a dedicated options page. To do so, the manifest offers a dedicated options_ui section where we can provide the path to the HTML page:

JSON

    "options_ui": {
      "page": "settings/options.html"
    }

The page can directly reference the scripts and stylesheet it needs. The script needs to:

- Store credentials in the browser storage on save.
- Load credentials from the browser storage when the settings page opens.

It's pretty straightforward with the provided example. My code is quite similar; it just needs three fields instead of one:

JavaScript

    function saveOptions(e) {
      browser.storage.sync.set({                                               //1
        listId: document.querySelector('#list-id').value,
        key: document.querySelector('#key').value,
        token: document.querySelector('#token').value,
      })
    }

    function restoreOptions() {
      browser.storage.sync.get()                                               //1
        .then(data => {
          document.querySelector('#list-id').value = data.listId || ''
          document.querySelector('#key').value = data.key || ''
          document.querySelector('#token').value = data.token || ''
        }, error => {
          console.error(`Error: ${error}`)
        })
    }

    document.addEventListener('DOMContentLoaded', restoreOptions)              //2
    document.querySelector('form').addEventListener('submit', saveOptions)     //3

1. Uses the Firefox storage API
2. Read from the storage when the page loads
3. Save to the storage when the user submits the HTML form

We also need to ask for the storage permission in the manifest:

JSON

    "permissions": [ "storage" ]

We can now store the Trello credentials (as well as the required Trello list id) on the settings page. We can use the same storage API in the Trello calling code to read credentials.

At this point, I was happy with my setup. I just added another round-trip from the background to the content to display an alert with Trello's card name and URL.

Conclusion

It was the first extension I wrote, and though the beginning was challenging, I achieved what I wanted. Now, I can navigate to a Papercall or a Sessionize page, click the extension button, and get the conference on my Trello board. It took me a couple of days and was fun; it was well worth it. I continue working on it to improve it bit by bit.

The complete source code for this post can be found on GitHub.

To Go Further

- Content scripts
- Background scripts
- webextensions-examples
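The article does not show the Trello calling code itself, so the following is only a minimal sketch of how the background script might create a card. It assumes Trello's "create a card" endpoint (POST /1/cards with idList, key, and token passed as query parameters) and reuses the listId, key, and token names stored by the options page. The conference fields (name, cfpDueDate, cfpUrl) and both function names are hypothetical.

```javascript
// Hypothetical sketch: the conference field names and function names below
// are assumptions, not taken from the article's repository.

// Build the URL for Trello's "create a card" REST endpoint (POST /1/cards).
function buildCreateCardUrl({ listId, key, token }, conference) {
  const url = new URL("https://api.trello.com/1/cards");
  url.searchParams.set("idList", listId); // list that receives the new card
  url.searchParams.set("key", key);
  url.searchParams.set("token", token);
  url.searchParams.set("name", conference.name);
  url.searchParams.set("due", conference.cfpDueDate);
  url.searchParams.set("desc", `CFP: ${conference.cfpUrl}`);
  return url.toString();
}

// Background-script side: read the stored options, then call Trello.
// `browser` is only defined inside the extension, so this function is
// meant to run there and nowhere else.
async function createCard(conference) {
  const credentials = await browser.storage.sync.get();
  const response = await fetch(buildCreateCardUrl(credentials, conference), {
    method: "POST",
  });
  if (!response.ok) {
    throw new Error(`Trello responded with HTTP ${response.status}`);
  }
  return response.json(); // the created card, as returned by Trello
}
```

Keeping the URL construction in its own pure function makes it easy to check outside the browser. For the request to go through, the manifest would most likely also need a host permission for https://api.trello.com/*.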

Pessimistic and Optimistic Locking With MySQL, jOOQ, and Kotlin

By Jakub JRZ

Simulating and Troubleshooting Thread Leak in Scala

By Ram Lakshmanan CORE

From Zero to Hero: Learning Web3 With Infura and Python

By Alvin Lee CORE

Getting Started With Astro

Web development, like most technologies, moves in waves and cycles. Static websites were all we had in the beginning. But, pretty soon, developers were hacking together the first server-generated sites thanks to Perl and PHP. This was the advance that would eventually kickstart frameworks like Laravel, Django, or Rails.

Mobile devices would come to change how people consume the web. So long server-generated websites, hello client-rendered applications. The next wave brought frameworks that could give users a more app-like experience, without reloads, like React or AngularJS.

While single-page applications provided a smoother experience, they have their drawbacks: namely, longer page loads caused by all the extra JavaScript that has to be parsed and executed, not to mention all the work it takes to optimize for search engines.

Astro is a prime example of the current wave we have been in since Next.js, taking us full circle: a web framework that combines server and client rendering to get the best of both worlds.

What Is Astro?

Astro is an open-source framework for generating web applications on top of popular UI frameworks like React, Preact, Vue, or Svelte. An Astro page is composed of several independent components. To improve load times, Astro strips away all JavaScript and prerenders pages on the server unless developers mark a component as interactive, in which case Astro will send the minimum amount of JavaScript required for interactivity.

Astro favors generating static and server-rendered sites without JavaScript in the browser. Developers can opt in to client-side rendering for given routes or parts of the page.

Thanks to this strategy, Astro pages load quickly, as no JavaScript needs to execute for the first render. In a process called hydration, Astro will “pour” JavaScript into components to make them dynamic.

Getting Started With Astro

To get started with Astro, install Node version 16.12.0 or higher and run the following command.
Follow the on-screen wizard and choose to create an empty project when prompted:

    $ npm create astro@latest

    astro   v1.9.1 Launch sequence initiated.

    ✔ Where would you like to create your new project? … awesome-website
    ✔ How would you like to setup your new project? › an empty project
    ✔ Template copied!
    ✔ Would you like to install npm dependencies? (recommended) … yes
    ✔ Packages installed!
    ✔ Would you like to initialize a new git repository? (optional) … yes
    ✔ Git repository created!
    ✔ How would you like to setup TypeScript? › Relaxed
    ✔ TypeScript settings applied!

    next   Liftoff confirmed. Explore your project!

Next, you can start the website in developer mode by entering the directory of the project you just created, running npm run dev, and visiting http://localhost:3000.

Pages and Routes

The interesting stuff in Astro happens inside the src folder. Checking what’s there, we see a single directory called pages with an index.astro file.

Astro pages are a mix of HTML and JavaScript or TypeScript. This is the default index.astro:

    ---
    ---

    <html lang="en">
      <head>
        <meta charset="utf-8" />
        <link rel="icon" type="image/svg+xml" href="/favicon.svg" />
        <meta name="viewport" content="width=device-width" />
        <meta name="generator" content={Astro.generator} />
        <title>Astro</title>
      </head>
      <body>
        <h1>Astro</h1>
      </body>
    </html>

You may have noticed that Astro files start with a section fenced by --- lines, called the frontmatter. Whatever code we put inside these fences is executed on the server and prerendered before serving the page.

Below the frontmatter, we find the content for this route, an augmented form of HTML that allows for variables.
We can, for example, define a variable in the frontmatter and use it in HTML like this:

    ---
    // src/pages/index.astro
    const title = "Astro";
    ---

    <html lang="en">
      <head>
        <meta charset="utf-8" />
        <link rel="icon" type="image/svg+xml" href="/favicon.svg" />
        <meta name="viewport" content="width=device-width" />
        <meta name="generator" content={Astro.generator} />
        <title>{title}</title>
      </head>
      <body>
        <h1>{title}</h1>
      </body>
    </html>

Astro uses file-based routing, so every file in the pages folder is mapped to a route in the website. For example, if we create a file called greetings.astro, we should see its contents at http://localhost:3000/greetings:

    ---
    const greeting = "Hello, world!";
    ---

    <html lang="en">
      <head>
        <meta charset="utf-8" />
        <link rel="icon" type="image/svg+xml" href="/favicon.svg" />
        <meta name="viewport" content="width=device-width" />
        <meta name="generator" content={Astro.generator} />
        <title>Astro</title>
      </head>
      <body>
        <h1>{greeting}</h1>
      </body>
    </html>

In addition to .astro files, Astro can parse Markdown, MDX, JSX, JavaScript, and TypeScript files. For instance, if we want to write a blog post in Markdown, we create a post.md file under the pages folder. Visiting the route will make Astro convert it into HTML on the fly:

    ---
    title: 'Learning Astro'
    pubDate: 2023-01-10
    description: 'A post written in Markdown.'
    author: 'Tommy'
    ---

    # Learning Astro

    This Markdown file should be rendered as HTML when I visit http://localhost:3000/post

Components

Astro components are *.astro files with reusable code and HTML. We can use components to write elements like headers, footers, navigation bars, buttons, and forms; anything that can be expressed as HTML can constitute a component.
Let’s create our first component in src/components/Header.astro:

    ---
    // src/components/Header.astro
    ---
    <meta charset="utf-8" />
    <link rel="icon" type="image/svg+xml" href="/favicon.svg" />
    <meta name="viewport" content="width=device-width" />
    <meta name="generator" content={Astro.generator} />
    <title>Astro</title>

Once defined, we can import it into any page (or other component) and use it like this:

    ---
    import Header from "../components/Header.astro";
    ---

    <html lang="en">
      <head>
        <Header />
      </head>
      <body>
      </body>
    </html>

Astro components are no different from pages. Any code defined between fences is executed on the server. JavaScript is stripped out before sending the content to the browser.

Layouts

Layouts are used to define reusable UI structures. They are technically components, so the syntax stays the same.

Let’s replace the content of index.astro with a layout:

    ---
    // src/pages/index.astro
    import SiteLayout from "../layouts/SiteLayout.astro";
    ---

    <SiteLayout></SiteLayout>

As you can see, layouts are stored in the src/layouts folder by convention.

Layouts, like components, can include other components. Here, we have extracted the structure that was in index.astro and added a Footer component:

    ---
    // src/layouts/SiteLayout.astro
    import Header from "../components/Header.astro";
    import Footer from "../components/Footer.astro";
    ---

    <html lang="en">
      <head>
        <Header />
      </head>
      <body>
        <Footer />
      </body>
    </html>

Props and Slots

So far, our website has been completely static. To pass data between pages and components, we need to understand how props and slots work.

Components and layouts can define and accept props (short for properties) via the global Astro.props. Values passed through props are accessible to the component before rendering.
We can read props in our component like this:

    ---
    // src/components/Header.astro
    const { title } = Astro.props;
    ---
    <meta charset="utf-8" />
    <link rel="icon" type="image/svg+xml" href="/favicon.svg" />
    <meta name="viewport" content="width=device-width" />
    <meta name="generator" content={Astro.generator} />
    <title>{title}</title>

The value of title can be supplied when the component is called, which, in the following example, happens through the SiteLayout layout:

    ---
    // src/layouts/SiteLayout.astro
    import Header from "../components/Header.astro";
    import Footer from "../components/Footer.astro";
    ---

    <html lang="en">
      <head>
        <Header title = "Welcome to my Astro Blog!" />
      </head>
      <body>
        <Footer />
      </body>
    </html>

Note: the spaces surrounding the equal sign are not required; the conventional form title="Hello" works as well as title = "Hello".

Slot elements create placeholders for content to be injected later. To see how it works, we can add a <slot /> element in src/layouts/SiteLayout.astro:

    ---
    // src/layouts/SiteLayout.astro
    import Header from "../components/Header.astro";
    import Footer from "../components/Footer.astro";
    ---

    <html lang="en">
      <head>
        <Header title = "Welcome to my Astro Blog!" />
      </head>
      <body>
        <slot />
        <Footer />
      </body>
    </html>

Now, the HTML inside <SiteLayout> is injected at the point where the slot is located:

    ---
    // src/pages/index.astro
    import SiteLayout from "../layouts/SiteLayout.astro";
    ---

    <SiteLayout>
      <p>This content is rendered in the slot</p>
    </SiteLayout>

Nesting components, layouts, props, and slots gives us the flexibility to create reusable UI files across the website.

Islands and Hydration

Up to this point, we have not shipped any JavaScript to the user; everything is prerendered and served as pure HTML+CSS. How do we make Astro send JavaScript to the browser? For that, we need to understand the islands architecture.

The islands architecture pattern aims to reduce the amount of JavaScript needed browser-side.
Less JavaScript means less data sent and less computational power required on the user’s device. An island is an autonomous component that bundles HTML, CSS, and, optionally, JavaScript. In the islands pattern, a page is made up of several independent islands.

The islands architecture allows combining static HTML, server-rendered content, and interactive client-rendered components on one page without conflicts.

Each island is prerendered, so there is no interactivity right after the page loads. Once the initial page is ready, the islands are converted into interactive content in a process called hydration. Hydration is a technique that converts static content delivered via static hosting or server-side rendering into a dynamic page by attaching event handlers to the HTML elements.

The first contentful paint happens right after all the HTML has been sent to the browser, which typically happens very quickly. The browser then requests and executes all the client-side JavaScript needed to make the page fully interactive.

Using Islands

An example showing how Astro implements islands is in order. Let’s first add a UI integration to the project. The following command installs @astrojs/preact and preact:

    $ npx astro add preact

Let’s create a simple button to test the integration:

    // src/components/MyButton.jsx
    export default function MyButton() {
      const clicked = () => { console.log('Hello!') };
      return (
        <div>
          <button style={{ color: 'purple' }} onClick={clicked}>Click me</button>
        </div>
      )
    }

As usual, Astro will try to strip away any JavaScript. So, nothing would happen if we instantiated the component with <MyButton />. We need to tell Astro we want this component treated as an island and hydrated accordingly by adding the template directive client:load:

    ---
    import MyButton from "../components/MyButton.jsx";
    ---

    <html lang="en">
      <body>
        <MyButton client:load />
      </body>
    </html>

Clicking the button should print “Hello!” in the browser console.
The client directive caused Astro to hydrate the component. There are five levels of hydration with different priorities:

- client:load hydrates the component as soon as possible.
- client:idle hydrates the component when the page is done loading. Useful for low-priority components that don’t need immediate interactivity.
- client:visible hydrates the component as soon as it appears in the viewport.
- client:media={string} takes a CSS media query as an argument and loads the component as soon as it’s fulfilled.
- client:only skips HTML rendering entirely and renders the component in-browser.

Conclusion

The popularity of this project has been nothing short of astronomical: in the first five months since the Astro 1.0 release, the project has amassed over 25,000 stars on GitHub. The success of Astro is not accidental. The Astro team has made a web framework that offers great ergonomics for developers and fast-loading websites for users, even if they are on low-power devices or slow connections.

Thanks for reading, and happy building!
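To make the file-based routing described under Pages and Routes more concrete, here is a small sketch of the mapping from a file under src/pages to the route it is served at. This is not Astro's actual router; the function name fileToRoute and the list of handled extensions are assumptions chosen for illustration.

```javascript
// Illustrative only: this is not Astro's real routing code, just the idea
// behind mapping files in src/pages to URLs.
const PAGE_EXTENSION = /\.(astro|md|mdx)$/;

function fileToRoute(filePath) {
  const segments = filePath
    .replace(/^src\/pages\//, "") // routes are relative to the pages folder
    .replace(PAGE_EXTENSION, "")  // the file extension is not part of the URL
    .split("/");
  // A trailing "index" file represents the folder itself.
  if (segments[segments.length - 1] === "index") {
    segments.pop();
  }
  return "/" + segments.join("/");
}

console.log(fileToRoute("src/pages/index.astro"));     // "/"
console.log(fileToRoute("src/pages/greetings.astro")); // "/greetings"
console.log(fileToRoute("src/pages/post.md"));         // "/post"
```

The three calls mirror the pages created earlier in the article: index.astro, greetings.astro, and post.md.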

By Tomas Fernandez

Reconciling Java and DevOps With JeKa

If you've ever implemented a Java project using a mainstream build system such as Ant, Maven, or Gradle, you've probably noticed that you need to use an extra language to describe how to build your project. While this may seem appealing for basic tasks, it can become trickier for more complicated ones. You need to learn a specific soup of XML, write verbose configurations, or write Kotlin DSLs that are intertwined with complex tooling.

If you've gone further by writing pipelines for deploying and managing releases, you've probably had to write shell or Groovy scripts to be used in your CI/CD tools. While this may be okay for simple tasks, it can become cumbersome when dealing with complexity. You'd prefer to apply the same knowledge and tooling you use for regular code, such as modeling, refactoring, and running/debugging in an IDE.

This is where JeKa comes in. JeKa is a very thin tool that allows you to execute arbitrary Java source code from the command line or within your IDE. While this may not seem like a big deal at first glance, this capability enables you to:

- Write any arbitrary script using plain Java code, run and debug it in an IDE, and invoke arbitrary public methods so you can host many scripts in a single class.
- Invoke this code from the command line or any CI/CD tool without needing to compile it. JeKa handles the compilation for you.
- Simply use any library available in the Java ecosystem in your scripts. Just declare dependencies in annotations, and JeKa will resolve them behind the scenes.

With this capability, you can get rid of cryptic shell scripts and implement powerful and portable scripts without needing additional knowledge.

The second stage of JeKa is the utilities it embeds.
When writing scripts, you can use any libraries, but JeKa also bundles some utilities that are frequently needed when implementing automation tasks, such as dealing with file sets and zip files, Git, launching OS processes synchronously and retrieving results, Java compilation/testing, Maven dependency/repo management, full JVM project build models, and XML handling. These utilities can help you implement CI/CD pipelines or even build/test entire Java projects.

The last stage consists of a plugin and parameterization mechanism that allows JeKa to be a first-class citizen in the build tool space. Each plugin provides methods and configurations to integrate external technology with minimal effort and little or no typing. Currently, there are plugins for JVM projects, Node.js, Spring Boot, SonarQube, JaCoCo, Kotlin, Protocol Buffers, and Nexus repositories.

With all these capabilities, JeKa lets you implement an entire Java project with automated delivery using a single language for everything. This language can be Java, or it can be Kotlin, as JeKa provides the same capabilities for both. Additionally, an IntelliJ plugin exists to improve the user experience with JeKa.

For a better understanding, check out this GitHub repository that demonstrates numerous projects built with JeKa. Through this, you'll gain insights into how JeKa can be used to build projects with popular technologies like Spring Boot, Kotlin, Node.js, SonarQube, and JaCoCo. JeKa also provides detailed documentation describing exactly how it works, so you won't be left to your own devices when using it.

What do you think about this initiative? Do you think JeKa can ease the full development-delivery cycle?

By jerome angibaud

A Beginner’s Guide To Styling CSS Forms

Put simply, HTML forms, or web forms, are web elements designed to let users enter information, such as their name, age, gender, or credit card number, which is then sent to a server for processing.

Web forms are very useful and are now a very important aspect of web development. Imagine having to travel to a foreign country just to fill out a form because you want to apply for a course at a university there. Since every modern university has a website with a form, students can sit in the comfort of their homes and apply at a time that suits them, saving them a visit to the school in person. Next, the school collects this information to decide if the candidate is qualified to study at the university.

Web forms are not limited to schools. Businesses such as banks and e-commerce sites, to mention a few, use web forms to collect information from their customers. This helps them decide how to serve the needs of their customers better. This is exactly what web forms are designed to do: collect information for processing.

In this tutorial on CSS Forms, we will take a closer look at how to style forms with CSS and much more.

Prerequisites for Styling CSS Forms

By the end of this tutorial, you will be in a position to build a form in HTML that is styled using CSS. However, a basic knowledge of HTML and CSS is required to understand this article. The source code of the finished project we'll be building is available on CodePen.

Creating HTML Boilerplates

Let’s start by creating the website boilerplate (that is, the HTML code structure for the website).
This contains the head and body tags, as seen below: <!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8"> <meta http-equiv="X-UA-Compatible" content="IE=edge"> <meta name="viewport" content="width=device-width,initial-scale=1.0"> <title>How to style forms with CSS: A beginner's guide</title> <link rel="stylesheet" href="main.css"> </head> <body> </body> </html> After creating our HTML boilerplate, save it as an index.html file. I am using VS Code but you can use any IDE of your choice. Creating the Necessary HTML Tags Now, let’s create the necessary HTML tags for our styling CSS Forms project: <body> <div class="site__container"> <main class="hero__images"> <main class="card__wrapper"> <section class="card__forms"> <section class="items__wrapper"> <div class="site__logo">MS</div> <div class="sign__caption"> <p>Sign in for home delivery service</p> </div> <div class="user_id"> <input type="text" name="Username" placeholder="Username"> <input type="password" name="Password" placeholder="Password"> </div> <div class="checkbox__wrapper"> <input type="checkbox" id="checkbox" name="checkbox"> <label for="checkbox">stay signed in</label> <a href="#">Forgot Password?</a> </div> <div class="btn__wrapper"> <button class="btn__signin">sign in</button> </div> <div class="signup__option"> <p>Don't have an account yet? <a href="#">Sign Up!</a></p> </div> </section> </section> </main> </main> </div> </body> From the code sample above, let’s look at what each tag is supposed to do based on the class names assigned to them: site__container: This class is assigned to a div tag that wraps around every other tag within our HTML tags. hero__images: This class is assigned to the main tag. This tag is where our hero image will be assigned using CSS. card__wrapper: This class is assigned to another main tag nested inside the hero__images tag. This tag wraps around all tags that make up our web form. 
card__forms: This class is assigned to the section tag, which is the main tag for our web form. items__wrapper: This tag wraps around the div, input, button, and link tags, which are items within the web form. site__logo: This is the site logo. sign__caption: This tag helps inform the user why they should sign up/sign in using the web form. user_id: This wraps around the input tags where the user has to enter their username and password. checkbox__wrapper: This wraps around the input, a (anchor), and label tags. Here, by way of the checkbox, we ask the user if they would like to stay signed in so they don’t have to retype their credentials when they next visit the site. We also offer a link for users who have forgotten their password and need to recover it. btn__wrapper: This wraps around the main button of the form. This is the button the user clicks on that helps them sign into the site. signup__option: This tag wraps around the paragraph tag and a link tag. Here, we provide an option for new users who don’t have an account to sign up. Now that we have the HTML set up, save it, and run it on your favorite browser. The code in this CSS Forms tutorial is run using Google Chrome. Browser Output Your output should look like this. You will notice how plain and simple the website is from our browser output. This is because we have not added the CSS yet. In the next section of this tutorial, we’ll talk about this. Styling Common Form Elements With CSS Form elements are some of the most common elements on a website. Every site must have these elements, from login forms to search boxes and buttons, to be functional. These elements are sometimes overlooked in design and styling, which may cause them to blend into each other, making your site’s UI look dull. A good way to avoid this is by using CSS to change the appearance of form elements such as text fields and buttons. Here’s how you can style common CSS form elements: 1. 
In Visual Studio Code, create a new file and name it main.css. This is going to be the CSS file. Notice from our HTML file that, in the head tag, we have a link tag that points to main.css. This link tag links the HTML and CSS files together, so the styles defined in main.css are applied to the matching elements in the HTML file. 2. Let’s add the relevant CSS code in main.css so we can style the boring page we created earlier using HTML. 3. Your editor should open. Now, let’s write some CSS. Applying Universal Selector The universal selector (*) is a CSS selector that matches every element on the page. It means you need to define a style only once, and it is applied to all elements instead of being repeated for each one. This saves us a lot of time and makes the code more maintainable. Type and run the code below for your CSS Forms project: *, *::after, *::before { padding: 0; margin: 0; box-sizing: border-box; } From the CSS code above, we use the CSS universal selector to target all the elements on the webpage, along with their ::before and ::after pseudo-elements. We set padding and margin to 0 and box-sizing to border-box. This removes the browser’s default spacing so we don’t have unnecessary white space interfering when styling CSS Forms, and border-box makes an element’s width and height include its padding and border. Applying Viewport Width (VW) and Viewport Height (VH) Viewport width (vw) and viewport height (vh) are length units introduced in CSS3 and have quite a few uses. The vw unit is equal to 1% of the viewport’s width, and the vh unit is equal to 1% of the viewport’s height. Using these units, you can do some cool things with your website: .site__container { width: 100vw; height: 100vh; } We target the div tag with a class of .site__container and assign it a width of 100vw and a height of 100vh. This sizes the container to take the browser viewport’s full width and height. You will not see any effect when you refresh your browser since other HTML tags have not been given specific sizing or styling. 
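As a rough illustration of the arithmetic behind viewport units, the sketch below converts vw and vh values to pixels for an assumed viewport size. The helper names (vwToPx, vhToPx) and the 1280 by 720 viewport are hypothetical and not part of this tutorial's project; in a browser, the real viewport size would come from window.innerWidth and window.innerHeight.

```javascript
// Hypothetical helpers: convert viewport units to pixels.
// 1vw is 1% of the viewport width; 1vh is 1% of the viewport height.
function vwToPx(vw, viewportWidth) {
  return (vw / 100) * viewportWidth;
}

function vhToPx(vh, viewportHeight) {
  return (vh / 100) * viewportHeight;
}

// Assumed viewport of 1280x720 pixels, for illustration only.
const viewport = { width: 1280, height: 720 };

// What `width: 100vw; height: 100vh;` works out to for that viewport.
const containerWidth = vwToPx(100, viewport.width);
const containerHeight = vhToPx(100, viewport.height);
console.log(`${containerWidth}px by ${containerHeight}px`);
```

With these assumed numbers, a container sized at 100vw by 100vh fills the whole viewport, which is why .site__container can serve as the full-page wrapper for the form.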
Applying Hero Image The hero image is a common element in blog design. It’s a large, eye-catching image that spans the full width of the page and often a bit of the page’s height. It is usually used to draw initial attention to an article or page and as an anchor point for future articles or pages within that site section. A hero image can also highlight content, such as images, videos, or other interactive elements, by making it the central point of focus on the page: .hero__images { height: 100%; background-image: url("./delivery-man.jpg"); background-repeat: no-repeat; background-size: cover; background-position: center; } Browser Output From the CSS code above, we assigned a height of 100% to the class of hero__images. This helps the hero__images class inherit the same height value set to the direct parent, which is 100vh. This helps the background image used to occupy the browser viewport, and then we set a background-image. We also added a background-repeat of no-repeat to prevent it from repeating, a background-size of cover, which helps set the image to cover the entire viewport, and a background-position of center, which helps center the image within the viewport or container. Centering the Form With Flexbox Centering the form with CSS Flexbox is easy. The only problem is it requires you to wrap the form in a parent container with a specified width, which is a bit of a pain. Thankfully, there’s a better solution that works in all browsers. You’ll need three elements: The actual form. A wrapper element (parent). An element for the actual content (child). We’ll use CSS Flexbox to center the web form to the browser center. Type and run the code below: .card__wrapper { height: 100%; display: flex; justify-content: center; align-items: center; } Browser Output In this section, we target the card__wrapper class, we set a height of 100%, a display of flex, justify-content of center, and align-items of center. 
This helps position the form to the center horizontally and vertically while styling CSS Forms. Styling the Form Element The HTML for a typical form consists of various input elements, each representing a different type of data. In CSS, you can style the input element in various ways to create distinction among them. Here we apply styling to the CSS form and add a specific width and height: .card__forms { display: flex; justify-content: center; width: 400px; height: 400px; background-color: rgb(1, 32, 32, 0.4); border-radius: 0.5rem; box-shadow: 3.5px 3.5px 4px 2px rgba(0, 0, 0, 0.3); border-top: 2px solid rgb(89, 250, 156); border-bottom: 2px solid rgb(89, 250, 156); } Browser Output We target the class of card__forms and apply a display of flex, justify-content of center, a width and height of 400px each to give it a defined size, and a background-color of rgb (1, 32, 32, 0.4). The fourth value, 0.4, is the alpha channel: it sets the color’s opacity to 40%, which makes the background partly transparent. We also added a border-radius of 0.5rem, a box-shadow, and a border-top and border-bottom of 2px solid rgb(89, 250, 156). This creates the solid lime color you can see at the top and bottom of our CSS Form. Styling Form Logo Many websites use the form logo to style the input field and submit button in a form. The first reason is the form is consistent with the overall design. The second reason is it makes it easier to differentiate between a regular input field and a submit button since the color is used for the submit button. 
Here we apply styling to the logo on the form element: .site__logo { width: 40px; padding: 4px; margin: 2.0rem 5rem; text-align: center; border-radius: 50%; font-size: x-large; font-weight: bolder; font-family: 'Trebuchet MS', sans-serif; background-color: rgb(89, 250, 156); color: rgb(1, 32, 32); cursor: default; } Browser Output We targeted the site__logo class and added a width of 40px, padding of 4px, and a margin of 2.0rem for the top and bottom and 5rem for the left and right (to add extra white space). We also apply text-align of center (to center the logo text), border-radius of 50% (to make the logo round), font-size of x-large, font-weight of bolder, and font-family of “Trebuchet MS.” Finally, we add a background-color of rgb (89, 250, 156), color of rgb (1, 32, 32), and cursor of default. Styling Site Caption The site caption is a little bit of text that appears at the top of every page on your website. This can be any text you want. It is typically used to identify who created the site and possibly provide legal information about the site’s content. By styling this text, we can make it stand out more or appear in multiple places on a page. Here we apply styling to the caption on the CSS Form: .sign__caption p { color: white; font-family: calibri; font-style: italic; text-transform: lowercase; margin-bottom: 1.5rem; } Browser Output We selected the sign__caption class and targeted the p tag inside it. We apply a text color of white, font-family of calibri, font-style of italic, text-transform of lowercase, and margin-bottom of 1.5rem (to apply extra white space at the bottom). Styling the Input Tag The input tag comes with a few styles by default. It has the look of a text field, and it’s a good idea to use the default styling for the most part. The default styling provides enough contrast between elements so users can easily read and understand what they’re filling in. 
Here we apply styling to the input tag on the CSS form, where users can enter their information: .user_id input { width: 100%; display: block; outline: none; border: 0; padding: 1rem; border-radius: 20px; margin: 0.8rem 0; color: rgb(1, 32, 32); } .user_id input::placeholder{ color: rgb(1, 32, 32); } .user_id input:active { outline: 2px solid rgb(89, 250, 156); } Browser Output We apply the following values from the code sample above to the input tags nested inside the user_id class: width: of 100% (so our input tag takes the full width of its container). display: of block (so each input tag sits on its own line). outline: of none (to remove the outline around the input tag when we click on it). border: of 0 (to remove the gray border around the input tag). padding: of 1rem (to add more space within the input tag to give room for the user’s input, such as usernames and passwords). border-radius: of 20px (to give it a rounded curve at the edge). margin: of 0.8rem 0 (0.8rem adds extra space at the top and bottom, while the 0 means no space should be added to the left and right of the input tag). color: of rgb (1, 32, 32). For the placeholder, we added a text color of rgb (1, 32, 32), which is responsible for the “Username” and “Password” text. And for the active state, we added an outline of 2px solid rgb (89, 250, 156). You will see the outline while you press down on the input field of the CSS form; to keep it visible for as long as the field has focus, the :focus pseudo-class can be used instead of :active. Styling Forget Password It is necessary to style the password field and its labels in the forgot-password form. For this purpose, you may need a combination of standard CSS properties and some custom properties. 
Here, we apply styling to the label and a tag, providing two options for users who want their account to remain signed in and for users who forgot their password to recover it: .checkbox__wrapper label { color: white; font-family: calibri; text-transform: lowercase; } .checkbox__wrapper a { color: rgb(89, 250, 156); font-family: calibri; text-transform: lowercase; text-decoration: none; font-style: italic; } .checkbox__wrapper a:hover { color: rgb(255, 255, 255); font-family: calibri; text-transform: lowercase; text-decoration: none; font-style: normal; } Browser Output In this section, we targeted the label tag nested inside the .checkbox__wrapper class and applied the following styling to it: color: of white. font-family: of calibri. text-transform: of lowercase. On the anchor tag, we applied: color: of rgb (89, 250, 156). text-decoration: of none (to remove the default underline on the anchor tag). font-style: of italic (to differentiate it from the label text). Since the anchor tag is a link that is meant to send a request, we decided to add a hover state, which signals to the user that this is a clickable link. On hover, we set the text color to rgb (255, 255, 255) and the font-style to normal. Style the Form Button The form button is the first thing a user will see on your website. A nice button can make a good impression, but a bad one can leave a user with a bad taste in their mouth before they even get to read any content. Here we apply styling to the button tag on the CSS form. 
This button enables the user to sign into the website: .btn__wrapper button { width: 100%; border: none; padding: 1rem; border-radius: 20px; text-transform: uppercase; font-weight: bolder; margin: 0.8rem 0; color: rgb(1, 32, 32); } .btn__wrapper button:hover { background-color: rgb(89, 250, 156); color: white; transition: all 0.5s ease-in-out; cursor: pointer; } Browser Output In this section, we targeted the button tag nested in the btn__wrapper class, and we applied: width: of 100% to make it have a full width within the container. border: is set to none to remove the gray border around the button. padding: of 1rem to add space between the “SIGN IN” text and the button tag. border-radius: of 20px to apply a round corner style to the border. text-transform: is set to uppercase to capitalize the text. font-weight: is set to bolder to make text bold. margin: is set to 0.8rem at the top and bottom to give white space around the object, while 0 at the left and right. color: of rgb (1, 32, 32). On hover, we set the background-color to rgb (89, 250, 156), text color to white to create a kind of invert effect when we hover over it, a transition of all 0.5s ease-in-out, and a cursor of a pointer. To see these changes, move your mouse pointer to hover on the button. Styling the Signup Option The actual signup option is styled to look like a button. The design of the button is simple and recognizable so users will know what it does. The input size for the email address is a bit smaller than usual to ensure the user doesn’t have to scroll up and down every time they want to add an email address. 
Here we apply styling to the p and a tag, where we provide options for users who don’t have an account yet but want to sign up: .signup__option p { color: white; font-family: calibri; text-transform: lowercase; } .signup__option a { color: rgb(89, 250, 156); font-family: calibri; text-transform: lowercase; text-decoration: none; font-style: italic; } .signup__option a:hover { color: rgb(255, 255, 255); font-family: calibri; text-transform: lowercase; text-decoration: none; font-style: normal; } Browser Output From the browser output, you will notice that the “stay signed in / forgot password” and “don’t have an account yet? / sign up!” rows look the same. Well, you guessed it right! We copied the CSS styles for the label tag of the checkbox__wrapper class and pasted them on .signup__option p, and then copied the styles for the forgot password link and pasted them on .signup__option a. Now we should have the same effect. Here is the link to the finished project on styling CSS forms. Summary You have learned how to style forms using CSS. You also learned how to center items using Flexbox, use transitions on buttons, apply background images, and cross-browser test your website or web app using LambdaTest. Alright! We’ve come to the end of this tutorial. Thanks for taking your time to read this article to completion. Feel free to ask questions. I’ll gladly reply.

By Alex Anie

Practical Example of Using CSS Layer

Let’s say we are developing a component library. Let’s say we are using React. Let’s say it has a button component. Conventionally, it will look like this: JavaScript // CustomButton.js import './CustomButton.css' const CustomButton = ({ children, className = '' }) => { const customClass = 'CustomButton ' + className; return <button className={customClass}>{children}</button> } And the styles will look like this: CSS /* CustomButton.css */ .CustomButton { background: aquamarine; padding: 4px 12px; border: none; } Here, the developers come to us and say: “The button is great, but we need a link in the form of a button!” No problem! Let’s add a property: JavaScript // CustomButton.js import './CustomButton.css' const CustomButton = ({ children, Component = 'button', className = '', ...props }) => { const customClass = 'CustomButton ' + className; return <Component className={customClass} {...props}>{children}</Component> } Example of use: JavaScript const linkCustomButton = ( <CustomButton Component="a" href="https://google.com"> Google.com </CustomButton> ); Oops, "a" is an inline element. What’s the matter, you ask? The button looks the way it did. Not exactly! Imagine that users want to add space above the button: JavaScript // App.js <CustomButton className="myCustomButton">Click it!</CustomButton> CSS /* App.css */ .myCustomButton { margin-top: 10px; } The inline element will ignore this margin. That’s okay! We’ve been in worse jams than this. Let’s change the style of the button: CSS /* CustomButton.css */ .CustomButton { background: aquamarine; padding: 4px 12px; border: none; display: inline-block; } No, there is another problem, much more vile and intractable. Watch the hands! Imagine the following way of using a button in a project. Let’s say there is some kind of loading state, and if it is true, the button needs to be hidden: JavaScript // App.js <CustomButton className={loading ? 
'hidden' : ''} /> The problem is that .hidden and .CustomButton have the same weight, and they both claim the display property. This means that the browser will have to pick the winner based on the order of appearance. Order is the last tier of the cascade. When your CSS files are broken into modules, you cannot rely on their specific order of appearance in the final bundle. As a result, sometimes one selector wins and sometimes the other. So, what do you do? Layer to the Rescue The Layer tier of the cascade sits right before Specificity and Order: Importance, Context, Layer (hi), Specificity, Order. Layers allow you to flexibly set different levels for your style sheets. In our example, there may be three such levels (or maybe more): a reset style sheet, a library style sheet, and an application style sheet. Let’s set these levels. CSS @layer reset, library; /* CustomButton.css */ @layer reset { .CustomButton { display: inline-block; } } @layer library { .CustomButton { background: aquamarine; padding: 4px 12px; border: none; } } But wait: we wanted to set three levels, yet we set only two. This is specific to the layer cascade. Anything defined outside of any layer automatically gets the highest priority. We don’t strictly need the library layer either, but I used it for clarity. It’s cool that all unlayered styles win because users don’t have to wrap their overriding styles in some @layer app. That is, this style will eventually override the one in @layer reset. CSS /* App.css */ .hidden { display: none; } Is it convenient? I think it’s just fantastic. The sad part is that browsers still don’t have this cascade :-( Yet.
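To make the ordering concrete, here is a toy model of how a winner is picked once layers enter the picture. It is only a sketch under simplifying assumptions: it ignores importance and context, reduces specificity to a single number, and assumes every layer is declared up front. The names (layerOrder, layerRank, resolve, and the rule objects) are invented for illustration and are not part of any browser API.

```javascript
// Toy model of three cascade tiers: Layer, then Specificity, then Order.
// Assumptions: no !important, a single origin, specificity as one number,
// and all layers declared up front as in `@layer reset, library;`.
const layerOrder = ['reset', 'library'];

// Unlayered rules (layer: null) rank above every declared layer.
function layerRank(layer) {
  return layer === null ? layerOrder.length : layerOrder.indexOf(layer);
}

// Returns the winning declaration among rules competing for one property.
function resolve(rules) {
  return rules
    .map((rule, index) => ({ ...rule, index })) // remember source order
    .sort((a, b) =>
      layerRank(a.layer) - layerRank(b.layer) ||
      a.specificity - b.specificity ||
      a.index - b.index
    )
    .pop(); // last after sorting has the highest priority
}

// Both selectors are single classes, so their specificity ties at 10.
const displayRules = [
  { selector: '.hidden', layer: null, specificity: 10, value: 'none' },
  { selector: '.CustomButton', layer: 'reset', specificity: 10, value: 'inline-block' },
];

const winner = resolve(displayRules);
console.log(winner.selector, winner.value);
```

Swapping the two entries in displayRules does not change the result in this model, which is the point of the article: the layer tier is decided before source order is ever consulted.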

By Alexey Shepelev CORE

Simulating and Troubleshooting BLOCKED Threads in Kotlin [Video]

In this latest article of the series about simulating and troubleshooting performance problems in Kotlin, let’s discuss how to make threads go into a BLOCKED state. A thread will enter into a BLOCKED state if it can’t acquire a lock on an object because another thread already holds the lock on the same object and doesn’t release it. Kotlin BLOCKED Thread Program Here is a sample program that would make threads go into a BLOCKED state. package com.buggyapp class BlockedApp { fun start() { println("BlockedApp:started") for (counter in 0..9) { // Launch 10 threads. AppThread().start() } } } class AppThread : Thread() { override fun run() { AppObject.something } } object AppObject { @get:Synchronized val something: Unit get() { while (true) { try { Thread.sleep(6000000.toLong()) } catch (e: Exception) { } } } } fun main() { BlockedApp().start() } The sample program contains the BlockedApp class. This class has a start() method. In this method, 10 new threads are created. In the AppThread class there is a run() method that reads the something property of AppObject, which on the JVM invokes its getSomething() getter. In this getSomething() method, the thread is put to continuous sleep; i.e., the thread is repeatedly sleeping for 6,000,000 milliseconds (100 minutes) again and again. But if you notice, the getter is annotated with @get:Synchronized, so getSomething() is a synchronized method. Synchronized methods can be executed by only one thread at a time. If any other thread tries to execute the getSomething() method while the previous thread is still working on it, then the new thread will be put in the BLOCKED state. In this case, 10 threads are launched to execute the getSomething() method. However, only one thread will acquire the lock and execute this method. The remaining 9 threads will be put in a BLOCKED state. NOTE: If threads are in the BLOCKED state for a prolonged period, then the application may become unresponsive. How To Diagnose BLOCKED Threads You can diagnose BLOCKED threads either through a manual or automated approach. 
Manual Approach In the manual approach, you need to capture thread dumps as the first step. A thread dump shows all the threads that are in memory and their code execution path. You can capture a thread dump using one of the 8 options mentioned here. But an important criterion is: You need to capture the thread dump right when the problem is happening (which might be tricky to do). Once the thread dump is captured, you need to manually copy the thread dump from your production servers to your local machine and analyze it using thread dump analysis tools like fastThread or samurai. Automated Approach On the other hand, you can also use the yCrash open source script, which captures 360-degree data (GC log, 3 snapshots of thread dump, heap dump, netstat, iostat, vmstat, top, top -H, etc.) right when the problem surfaces in the application stack and analyzes it instantly to generate a root cause analysis report. We used the automated approach. Below is the root cause analysis report generated by the yCrash tool highlighting the source of the problem. yCrash reporting transitive dependency graph of 9 BLOCKED threads yCrash prints a transitive dependency graph that shows which threads are getting BLOCKED and who is blocking them. In this transitive graph, you can see "Thread-0" blocking 9 other threads. If you click on the thread names in the graph, you can see the stack trace of that particular thread. yCrash reporting the stack trace of 9 threads that are in BLOCKED state Here is the screenshot that shows the stack trace of the 9 threads that are in the BLOCKED state, and it also points out the stack trace in which they are stuck. From the stack trace, you can observe that the threads are stuck on the com.buggyapp.blockedapp.AppObject#getSomething() method. Equipped with this information, one can easily identify the root cause of the BLOCKED state threads. Video To see the visual walk-through of this post, click below:

By Ram Lakshmanan CORE

What To Be Wary of With Iterators and Collections in C#

This article does not set out to persuade you, the reader, that using an iterator or materialized collection will universally solve your problems. Iterators and materialized collections can both be used to solve the scenarios that we’ll be looking at, but each comes with a different set of pros and cons that we can further explore. The purpose of this article is to highlight scenarios based on real world experiences where an iterator or materialized collection was being misunderstood, misused, and, ultimately, leading to a pile of headaches. As you read this article, if you find yourself saying “Well, sure, but they should have…” you’re probably right. The problem is fundamentally not the usage of an iterator or the materialized collection, but not understanding how to consume them effectively. So, I hope that when you’re working with newer software engineers or people less familiar with some of these concepts, you can be reminded to impart your wisdom. If you’re interested in the companion video to this article, check out this YouTube video: “Hidden Dangers of Iterators and Collections in C#.” Common Iterator and Collection Scenario Setup To give us some common ground as we explore an approach with a materialized collection in contrast with an iterator, let’s expand on the real world examples where I see these challenges regularly coming up. Let’s assume that you have a data access layer in your application that is responsible for getting records from a database or some data store. You build some API that the rest of your application can use, and you’ll be using the results of that API in situations such as: Running LINQ methods (Any(), Count(), or even filtering using Where()). Showing resulting data sets in a user interface. Using resulting data for sorting, filtering, or otherwise running algorithms with this data as the source. 
Another key ingredient to mention here is that because this is anchored in the real world…code bases change and evolve over time. People come up with new use cases for the data access layer. There’s more data added into the data store pushing limits that people never would have accepted. You have new or more junior developers coming into the code base. It’s real life and until we have more automated tech to police these things, we’re going to run into fun issues. Materializing Large Datasets Before we focus on iterators, let’s explore the more common approach, which involves materialized collections. Given the common scenario we discussed above, you have a method that might look like the following: public List<string> GetEntriesFromDatabase() { // incur some latency for connecting to the database var connection = _connectionFactory.OpenNew(); var command = connection.CreateCommand(); // TODO: actually create the query on the command, but this is just to illustrate var resultsReader = command.Execute(); List<string> results = new List<string>(); while (resultsReader.Read()) { // TODO: pull the data off the reader... this example just uses a single field var value = resultsReader.GetValue(0); results.Add(value); } return results; } There is nothing glaringly wrong with this example, and, in fact, by leaving the actual query up to your imagination, I’ve omitted where a lot of the trouble can come from. Let’s use an example from my GitHub repo to simulate what this might look like so we have a reference point: List<string> PretendThisGoesToADatabaseAsList() { // let's simulate some exaggerated latency to the DB Thread.Sleep(5000); Console.WriteLine($"{DateTime.Now} - <DB now sending back results>"); // now let's assume we run some query that pulls back 100,000 strings from // the database List<string> results = new List<string>(); while (results.Count < 100_000) { // simulate a tiny bit of latency on the "reader" that would be // reading data back from the database... 
every so often we'll // sleep a little bit just to slow it down if ((results.Count % 100) == 0) { Thread.Sleep(1); } results.Add(Guid.NewGuid().ToString()); } return results; } Note: the delays in the example code above are artificially inflated so that if you run this in a console, you can observe the different effects of changing the variables. Now that we have the code snippet that simulates pulling from the database by building up a full collection first, let’s look at some calling code that can exercise it (also on GitHub): long memoryBefore = GC.GetTotalMemory(true); Console.WriteLine($"{DateTime.Now} - Getting data from the database using List..."); List<string> databaseResultsList = PretendThisGoesToADatabaseAsList(); Console.WriteLine($"{DateTime.Now} - Got data from the database using List."); Console.WriteLine($"{DateTime.Now} - Has Data: {databaseResultsList.Any()}"); Console.WriteLine($"{DateTime.Now} - Count of Data: {databaseResultsList.Count}"); long memoryAfter = GC.GetTotalMemory(true); Console.WriteLine($"{DateTime.Now} - Memory Increase (bytes): {memoryAfter - memoryBefore}"); The calling code will take a snapshot of memory before we call our method and perform operations on the result. The two things we’ll be doing with the result are: Calling the LINQ method Any(). Calling Count directly on the list. As a side note, the Count() LINQ method will not require full enumeration as it has an optimization to check if there’s a known length. Examining the Materialized Collection Results With the materialized collection example, we can call the method and store the result set in memory. Given the two operations we are trying to use on the collection, Any() and Count, this information is quickly accessible to us because we’ve paid the performance hit one time to materialize the results into a list. Compared to an iterator, this approach does not run the risk of allowing callers to accidentally fully re-enumerate the results. 
This is because the result set is materialized once. However, the implication here is that, depending on the size of the results and how expensive it might be to fully materialize that full result set, you could be paying a disproportionate price for things like Any() that only need to know the existence of one element before they return true. If you recall what I said at the start of this article: if your mind automatically jumps to “Well someone should build a dedicated query for that,” then…yes, that’s absolutely a solution. But, what I’m here to tell you is that it’s a very common thing for something like this to slip through the cracks of a code review because of the LINQ syntax we have available to us. Especially if someone writes something like: CallTheMethodThatActuallyMaterializesToAList().Any() In this example, if the method name wasn’t quite so obvious, you’d have no issue with an iterator but a huge concern with a heavy-handed list materialization. Why is it so heavy-handed? Well, one could argue it’s doing exactly what it was coded to do, but we need to consider how callers are going to be taking advantage of this. If callers rarely need to deal with the full data set and mostly need Any(), First(), or other lighter weight operations that don’t necessarily need the entire result set, they don’t have a choice with this API. They will be paying the full price to materialize the entire result set when, in reality, maybe they just needed to walk through several elements. In the example code above, this results in multiple megabytes of string data being allocated when all we need is a count of the data and a check for whether there was any data. Yes, it looks contrived, but this is simply to illustrate that this API design does not lend itself well to particular use cases for callers. Let’s Look at Iterators Let’s go ahead and contrast the previous example with an iterator approach. 
We’ll start with the code, which you can find on GitHub: IEnumerable<string> PretendThisGoesToADatabaseAsIterator() { // let's simulate some exaggerated latency to the DB Thread.Sleep(5000); Console.WriteLine($"{DateTime.Now} - <DB now sending back results>"); // now let's assume we run some query that pulls back 100,000 strings from // the database for (int i = 0; i < 100_000; i++) { // simulate a tiny bit of latency on the "reader" that would be // reading data back from the database... every so often we'll // sleep a little bit just to slow it down if ((i % 100) == 0) { Thread.Sleep(1); } yield return Guid.NewGuid().ToString(); } } As you can see in the code above, we have an iterator structured to be almost identical except for: It’s an iterator. The yield return keyword is required here. The return type is IEnumerable<T> instead of List<T>. As a quick recap, an iterator will not be able to provide a caller with a count like we could do with other collection types, and all it can do is allow a caller to step through item by item. We can use a similar calling code snippet over our iterator, but let’s go ahead and add in a couple of additional console writing lines (here on GitHub): long memoryBefore = GC.GetTotalMemory(true); Console.WriteLine($"{DateTime.Now} - Getting data from the database using iterator..."); IEnumerable<string> databaseResultsIterator = PretendThisGoesToADatabaseAsIterator(); Console.WriteLine($"{DateTime.Now} - \"Got data\" (not actually... 
it's lazy evaluated) from the database using iterator."); Console.WriteLine($"{DateTime.Now} - Has Data: {databaseResultsIterator.Any()}"); Console.WriteLine($"{DateTime.Now} - Finished checking if database has data using iterator."); Console.WriteLine($"{DateTime.Now} - Count of Data: {databaseResultsIterator.Count()}"); Console.WriteLine($"{DateTime.Now} - Finished counting data from database using iterator."); long memoryAfter = GC.GetTotalMemory(true); Console.WriteLine($"{DateTime.Now} - Memory Increase (bytes): {memoryAfter - memoryBefore}"); The additional lines of console writing just provide some additional context for where our code will be spending time. Do Iterators Make All the Problems Go Away? The short answer: no. The long answer: Iterators can make some of the earlier issues we saw with materialized collections go away, but they come with their own challenges for folks that are not familiar with working with them. When we consider the memory footprint in this example, it’s nearly nothing in comparison to the prior example. This is the case because at no point in time in this calling code example did we need to have the entire result set materialized for us to answer the questions that we were interested in. Will that always be the case? Absolutely not. However, one of the benefits of iterators here is that a caller now has the choice. These choices include whether they just want to do partial enumeration, full enumeration, or full enumeration to materialize the result set. The key here is flexibility in how the API is consumed. But flexibility comes with a trade, and this is something I see far more frequently with newer C# programmers because they are not actually familiar with iterators. The example above? Sure, it doesn’t use much memory at all but it will run PretendThisGoesToADatabaseAsIterator twice. 
Yes, to you reader with the keen eyes, you likely already noticed this but with a small adjustment to the naming and calling convention: var results = GetEntriesFromDatabase(); var any = results.Any(); var count = results.Count(); Suddenly, you can’t tell if you’re dealing with an iterator or a materialized collection. Before you shout “Well this is why we never use var!” let me tweak it once more: IEnumerable<string> results = GetEntriesFromDatabase(); var any = results.Any(); var count = results.Count(); The truth is, var doesn’t matter here because you just don’t know if GetEntriesFromDatabase() is an iterator or materialized collection. So, without getting into the weeds of a million different ways we could try and improve this, the point I would like to highlight to you is that people CAN and DO get this messed up in production code bases. All of the time. A bonus round for iterators is that given the lazy nature of how they’re evaluated, I have seen layered architectures pass the enumerable all the way to a frontend to finally have it evaluated. The result was that all of the impressive asynchronous data loading support was completely foiled because the main thread ended up being the unfortunate soul that would call the iterator. Iterators or Materialized Collections? It depends. To be crystal clear, because I mentioned it in the beginning of this article, the intention of writing all of this was not to tell you that an iterator is better or worse than a materialized collection. If you’re a junior software engineer working in C#, then I hope you saw some of the trade-offs and situations where people can get caught up. Some of this stuff isn’t totally obvious and might require some practice using either approach to understand it better. If you’re a more senior software engineer and you read this article being frustrated that you had ways to solve my examples…good. 
I would like you to take that energy to the team you’re on and ensure you can work with more junior engineers. Help them understand where some of these issues come up and how they can avoid them. My personal preference? I like using iterator-based APIs because I like having the flexibility to stream results. However, after many years of doing this, I am digging into some of the performance characteristics. Especially when we have access to things like spans, I might be heading back out to do a bit more research.
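
The double-enumeration pitfall described above is not unique to C#. As a rough illustration only, here is a minimal sketch in Kotlin, whose Sequence type is lazily evaluated in a similar way to a C# iterator. The function name pretendThisGoesToADatabase and the runs counter are hypothetical stand-ins rather than code from the GitHub samples, and the row count is reduced to keep the run short.

```kotlin
// Hypothetical counter: how many times the "query" body has been started.
var runs = 0

// Hypothetical stand-in for an iterator-based data access method.
// The block passed to sequence { } is re-executed for every new iteration.
fun pretendThisGoesToADatabase(): Sequence<String> = sequence {
    runs++
    for (i in 0 until 1_000) {
        yield("row-$i")
    }
}

fun main() {
    // Lazy usage: each terminal operation starts the body again.
    val lazyResults = pretendThisGoesToADatabase()
    val any = lazyResults.any()     // starts the body, stops after the first element
    val count = lazyResults.count() // starts the body a second time and walks everything
    println("any=$any count=$count runs=$runs") // prints: any=true count=1000 runs=2

    // Materialized usage: one full pass, then cheap in-memory checks.
    runs = 0
    val materialized = pretendThisGoesToADatabase().toList()
    println("any=${materialized.any()} count=${materialized.count()} runs=$runs") // prints: any=true count=1000 runs=1
}
```

Materializing once with toList() trades memory for a single pass over the source, which is the same trade-off weighed throughout this article.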

By Nick Cosentino

JSON in Kotlin

In any web service that receives and transmits data to and from a server, the first and last events will usually be transforming the data from the format used by the web request into the format that the web server will handle, and vice versa; these operations are called deserialization and serialization, respectively. For some web services, the thought put towards this part of the flow of data is focused solely on how to configure the serialization mechanism so it works properly. However, there are some scenarios for which every CPU cycle counts, and the faster the serialization mechanism can work, the better. This article will explore the development and performance characteristics of four different options for working with the serialization of JSON messages (GSON, Jackson, JSON-B, and Kotlinx Serialization), using both the Kotlin programming language and some of the unique features that Kotlin offers compared to its counterpart language, Java. Setup Since its 1.0 release in 2016, Kotlin has grown by leaps and bounds within the JVM community, becoming the go-to programming language for Android development as well as a first-class citizen in major JVM tools like Spring, JUnit, Gradle, and more. Among the innovations it brought to the JVM community compared to Java was the data class, a special type of class that is to be used primarily as a holder of data (in other words, a Data Transfer Object, or DTO) and automatically generates base utility functions for the class like equals(), hashCode(), copy(), and more. This will form the base of the classes that will be used for the performance tests, the first of which is PojoFoo. 
“Pojo” stands for “Plain Old Java Object,” signifying using only basic class types of the Java programming language: Kotlin data class PojoFoo(var fizz: String, var bizz: Int, var bazz: List<String>) { constructor() : this("", 0, emptyList()) } For those who are not familiar with the Kotlin programming language: the class has three attributes (fizz, bizz, and bazz), each of which has both a getter and a setter function. There are two constructors for the class: one that requires arguments for each of the attributes, and one that requires no arguments and populates the attributes with default values. This second constructor is the “no-arg constructor” that is typically required by JSON serialization mechanisms. In the example above, the three class attributes are marked with the keyword var; this signifies that the attributes are mutable and can be modified at any time during the lifetime of an instance of the class. To make the attributes immutable, all that is needed is to change the designator to val, upon which the attributes will become the equivalent of final attributes in Java, and Kotlin will no longer generate a setter function for the attributes. In addition, this removes the requirement of a no-arg constructor, so that can be eliminated from the code: Kotlin data class ImmutableFoo(val fizz: String, val bizz: Int, val bazz: List<String>) The next example class, DefaultValueFoo, uses a default value for the attribute fizz. This means that, if the constructor of DefaultValueFoo is invoked and no argument is provided for fizz, then the attribute will be assigned the default value: Kotlin data class DefaultValueFoo(var fizz: String = "FUZZ", var bizz: Int, var bazz: List<String>) { constructor() : this(bizz = 0, bazz = emptyList()) } Finally, the example class ValueClassFoo changes the type of attribute bizz from a plain integer to an inline class. 
Inline classes function as wrappers around a single “payload” value; while the Kotlin code will treat the inline class as a “genuine” class, the compiler will translate the code so that only the payload value is present. This provides several advantages compared to simply using the payload value directly, such as enforcing type safety for different variables, for example specifying a username and a password type (two types that would normally both be strings) for a login function. In this case, it allows for the usage of UInt: a Kotlin-exclusive class that simulates the behavior of an unsigned integer, something that is not supported by default by the JVM: Kotlin data class ValueClassFoo(var fizz: String, var bizz: UInt, var bazz: List<String>) { constructor() : this("", 0u, emptyList()) } Note: the class is named as such because, while inline classes are still referred to by that name in the Kotlin documentation, they have been renamed as value classes in the actual code; the keyword inline is deprecated. The Contestants GSON Introduced in 2008 and developed by Google, GSON is one of the main options that Java users employ for conducting serialization between JSON strings and Java objects and is the preferred library to leverage in Android development thanks to the support by Google. Usage The basic usage is to construct an instance of Gson and invoke the functions Gson.toJson() and Gson.fromJson() to serialize an object and deserialize a JSON string, respectively. Working With Kotlin Surprisingly, there are no additional steps necessary to work with the four example classes; all of the code snippets provided above were from the GSON testing code. Jackson Introduced in 2009, Jackson is the other widely used JSON serialization library, alongside GSON, and is used by default in major JVM ecosystems like the Spring framework. 
Usage The basic usage is to construct an instance of ObjectMapper and invoke the functions ObjectMapper.writeValueAsString() and ObjectMapper.readValue() to serialize an object and deserialize a JSON string, respectively. Working With Kotlin Unlike GSON, there is quite a bit of work that is necessary to support the Kotlin features in the example classes: Jackson does not have a native concept of deserializing classes that do not possess a no-arg constructor; if it cannot find a no-arg constructor, it will normally raise an exception. A workaround for this is to mark the parameters in the constructor with @JsonProperty so that Jackson knows which argument corresponds to which class attribute: Kotlin data class ImmutableFoo( @param:JsonProperty("fizz") val fizz: String, @param:JsonProperty("bizz") val bizz: Int, @param:JsonProperty("bazz") val bazz: List<String> ) Inline classes are not processed properly due to a difference in how Jackson computes how to conduct serialization and deserialization on a class. An advantage of these serialization libraries is that they do not normally require the creation of specialized classes to conduct the serialization and deserialization actions on a class. Instead, they compute which fields to pull values from and set via reflection; whereas GSON executes the reflection actions on the actual attribute fields within the target class, Jackson’s reflection actions are targeted on the attributes’ getter and setter functions. This is an issue with inline classes, as any function that accepts or returns an inline class is name-mangled to prevent collisions with functions that might accept the equivalent “normal” type in the JVM. 
Thus, serializing and deserializing classes with inline class attributes will prove problematic: // Attempting to serialize to JSON expected: <{"fizz":"FUZZ","bizz":5,"bazz":["BUZZ","BOZZ"]}> but was: <{"fizz":"FUZZ","bazz":["BUZZ","BOZZ"],"bizz-pVg5ArA":5}> // Attempting to deserialize from JSON Unrecognized field "bizz" (class com.severett.serializationcomparison.jackson.model.ValueClassFoo), not marked as ignorable (3 known properties: "fizz", "bizz-WZ4Q5Ns", "bazz"]) While there is a specialized module for Jackson— jackson-module-kotlin—which provides support for many parts of Kotlin that are not included in the testing here (e.g., Pair, Triple, IntRange, etc.), it does not provide support for inline classes and does not plan on offering support for the foreseeable future. Instead, it is necessary to create custom serializer and deserializer classes to handle ValueClassFoo and mark ValueClassFoo with @JsonSerialize and @JsonDeserialize, respectively: Kotlin class ValueClassFooSerializer : JsonSerializer<ValueClassFoo>() { override fun serialize(value: ValueClassFoo, gen: JsonGenerator, serializers: SerializerProvider?) 
{ gen.writeStartObject() gen.writeStringField(ValueClassFoo.FIZZ_FIELD, value.fizz) gen.writeNumberField(ValueClassFoo.BIZZ_FIELD, value.bizz.toInt()) gen.writeArrayFieldStart(ValueClassFoo.BAZZ_FIELD) value.bazz.forEach(gen::writeString) gen.writeEndArray() gen.writeEndObject() } } class ValueClassFooDeserializer : JsonDeserializer<ValueClassFoo>() { override fun deserialize(jsonParser: JsonParser, ctxt: DeserializationContext?): ValueClassFoo { val node = jsonParser.codec.readTree<JsonNode>(jsonParser) return ValueClassFoo( fizz = node[ValueClassFoo.FIZZ_FIELD].asText(), bizz = node[ValueClassFoo.BIZZ_FIELD].asInt().toUInt(), bazz = (node[ValueClassFoo.BAZZ_FIELD] as ArrayNode).map { it.textValue() } ) } } @JsonSerialize(using = ValueClassFooSerializer::class) @JsonDeserialize(using = ValueClassFooDeserializer::class) data class ValueClassFoo(var fizz: String, var bizz: UInt, var bazz: List<String>) { constructor() : this("", 0u, emptyList()) companion object { const val FIZZ_FIELD = "fizz" const val BIZZ_FIELD = "bizz" const val BAZZ_FIELD = "bazz" } } JSON-B A relative newcomer to the Java world—having been first released only in 2017 alongside JEE 8—JSON-B is an official standard for conducting serialization and deserialization for the JSON data format. The API uses either Eclipse Yasson or Apache Johnzon as the underlying implementation, meaning either one of these libraries would have to be included as a runtime dependency; the tests for this article used Yasson as the implementation. Usage The basic usage is to construct an instance of Jsonb via JsonbBuilder.create() and invoke the functions Jsonb.toJson() and Jsonb.fromJson() to serialize an object and deserialize a JSON string, respectively. Working With Kotlin JSON-B requires the most work of the four libraries evaluated to properly work with Kotlin. JSON-B serializes a class’s attributes in alphabetical order instead of declaration order. 
While this is not a deal-breaker—JSON objects do not require ordering for key fields—it is necessary to annotate a class with @JsonbPropertyOrder if specific ordering is desired: Kotlin @JsonbPropertyOrder("fizz", "bizz", "bazz") data class PojoFoo(var fizz: String, var bizz: Int, var bazz: List<String>) { constructor() : this("", 0, emptyList()) } Like Jackson, JSON-B requires a no-arg constructor and will fail if it does not encounter one while deserializing a JSON string into a class. Thus, a class without a no-arg constructor will need to mark the constructor that JSON-B needs to use with @JsonbCreator and mark each of the constructor’s arguments with @JsonbProperty so they correspond to the class’s attributes: Kotlin @JsonbPropertyOrder("fizz", "bizz", "bazz") data class ImmutableFoo @JsonbCreator constructor( @JsonbProperty("fizz") val fizz: String, @JsonbProperty("bizz") val bizz: Int, @JsonbProperty("bazz") val bazz: List<String> ) Lastly, JSON-B also shares Jackson’s trait of not being able to handle inline classes properly. Attempting to serialize ValueClassFoo will produce an incorrect output, and while JSON-B will not fail while trying to deserialize a string to ValueClassFoo, it will fail to populate the inline class attribute correctly: // Attempting to serialize to JSON expected: <{"fizz":"FUZZ","bizz":5,"bazz":["BUZZ","BOZZ"]}> but was: <{"bazz":["BUZZ","BOZZ"],"bizz-pVg5ArA":5,"fizz":"FUZZ"}> // Attempting to deserialize from JSON expected: <ValueClassFoo(fizz=FUZZ, bizz=5, bazz=[BUZZ, BOZZ])> but was: <ValueClassFoo(fizz=FUZZ, bizz=0, bazz=[BUZZ, BOZZ])> Like Jackson, the target class will need special serializer and deserializer classes to handle it and be annotated as such: Kotlin class ValueClassFooSerializer : JsonbSerializer<ValueClassFoo> { override fun serialize(valueClassFoo: ValueClassFoo, generator: JsonGenerator, ctx: SerializationContext?) 
{ generator.writeStartObject() generator.write(ValueClassFoo.FIZZ_FIELD, valueClassFoo.fizz) generator.write(ValueClassFoo.BIZZ_FIELD, valueClassFoo.bizz.toInt()) generator.writeStartArray(ValueClassFoo.BAZZ_FIELD) valueClassFoo.bazz.forEach(generator::write) generator.writeEnd() generator.writeEnd() } } class ValueClassFooDeserializer : JsonbDeserializer<ValueClassFoo> { override fun deserialize(jsonParser: JsonParser, ctx: DeserializationContext?, rtType: Type?): ValueClassFoo { var fizz: String? = null var bizz: UInt? = null var bazz: List<String>? = null while (jsonParser.hasNext()) { val event = jsonParser.next() if (event != JsonParser.Event.KEY_NAME) continue when (jsonParser.string) { ValueClassFoo.FIZZ_FIELD -> { jsonParser.next() fizz = jsonParser.string } ValueClassFoo.BIZZ_FIELD -> { jsonParser.next() bizz = jsonParser.int.toUInt() } ValueClassFoo.BAZZ_FIELD -> { jsonParser.next() bazz = jsonParser.array.getValuesAs(JsonString::class.java).map { it.string } } } } if (fizz != null && bizz != null && bazz != null) { return ValueClassFoo(fizz = fizz, bizz = bizz, bazz = bazz) } else { throw IllegalStateException("'fizz', 'bizz', and 'bazz' must be not null") } } } @JsonbTypeDeserializer(ValueClassFooDeserializer::class) @JsonbTypeSerializer(ValueClassFooSerializer::class) data class ValueClassFoo(var fizz: String, var bizz: UInt, var bazz: List<String>) { constructor() : this("", 0u, emptyList()) companion object { const val FIZZ_FIELD = "fizz" const val BIZZ_FIELD = "bizz" const val BAZZ_FIELD = "bazz" } } Kotlinx Serialization Finally, the authors of Kotlin have published their own serialization library for the Kotlin programming language. First released in 2020, the Kotlinx Serialization library is designed for serialization actions in general, not just JSON; while the library only contains official support for JSON, it has experimental support for other formats like Protobuf and CBOR as well as community support for formats like YAML. 
Usage Unlike the other JSON serialization libraries, there is no instance object that needs to be created for conducting serialization actions. Instead, calls to the extension functions encodeToString() and decodeFromString() are made on the serializing object in question, in this case the Kotlin object Json. Working With Kotlin Unlike the other JSON serialization libraries, Kotlinx Serialization does not work on custom classes by default. This is due to the way the library works: instead of using reflection like the other libraries, Kotlinx Serialization generates specific serialization and deserialization functions for the target class(es) at compile time. So that the library can recognize which classes need this serialization code generated for them, any target classes need to be annotated with @Serializable (a different method is available for third-party classes): Kotlin @Serializable data class PojoFoo(var fizz: String, var bizz: Int, var bazz: List<String>) { constructor() : this("", 0, emptyList()) } In addition, Kotlinx Serialization does not work by default on attributes with a default value. This needs to be enabled with the annotation @EncodeDefault: Kotlin @Serializable @OptIn(ExperimentalSerializationApi::class) data class DefaultValueFoo(@EncodeDefault val fizz: String = "FUZZ", var bizz: Int, var bazz: List<String>) { constructor() : this(bizz = 0, bazz = emptyList()) } Testing Parameters Each of the four JSON serialization libraries conducts serialization and deserialization of the four example classes, and the Java Microbenchmark Harness (JMH) benchmark tests measure the throughput of how many operations get executed per second on average. 
For example: Kotlin @State(Scope.Benchmark) open class SerializationComparison { private val gson = Gson() @Benchmark fun serializePojoFoo(): String = gson.toJson(pojoFoo) @Benchmark fun serializeImmutableFoo(): String = gson.toJson(immutableFoo) @Benchmark fun serializeDefaultValueFoo(): String = gson.toJson(defaultValueFoo) @Benchmark fun serializeValueClassFoo(): String = gson.toJson(valueClassFoo) @Benchmark fun deserializePojoFoo(): PojoFoo = gson.fromJson(pojoFooStr, PojoFoo::class.java) @Benchmark fun deserializeImmutableFoo(): ImmutableFoo = gson.fromJson(immutableFooStr, ImmutableFoo::class.java) @Benchmark fun deserializeDefaultValueFoo(): DefaultValueFoo = gson.fromJson(defaultValueFooStr, DefaultValueFoo::class.java) @Benchmark fun deserializeValueClassFoo(): ValueClassFoo = gson.fromJson(valueClassFooStr, ValueClassFoo::class.java) } These tests utilize JMH’s defaults of: Five warmup rounds of ten seconds. Five rounds of measurements. Five forked processes to conduct both of the above. The tests are run on a macOS with an Intel Core i7 2.6 GHz 6-Core and 16GB of RAM; the executing JVM is Temurin 19+36. Results Serialization The clear winner among the four libraries is Kotlinx Serialization, as it averages over 5 million operations per second, much faster than the second-place Jackson library. It’d be impossible to identify the exact reasons for why the performance of Kotlinx Serialization is so much higher compared to the competition without diving too deeply into the source code of each library, but a hint may lie in how the other libraries perform much better during the serialization of ValueClassFoo compared to the other example classes (the exception is Kotlinx Serialization, which appears to do worse, but given the error ranges for each result, it’s not statistically significant). 
For example, running the Java Flight Recorder profiler on Jackson provides the following result in the call tree for serializing PojoFoo: In contrast, here is the call tree for serializing ValueClassFoo: As the two call trees show, creating a special class for the serialization of instances of ValueClassFoo means Jackson does not have to use reflection (a very expensive process, computationally speaking) to determine what attributes need to be serialized. Of course, this comes with the downside of having more code for the developer to maintain, and that code will break as soon as the class’s attributes are modified. Deserialization Again, Kotlinx Serialization clearly performs better for deserializing compared to the remaining three libraries. GSON, Jackson, and Kotlinx Serialization all performed markedly better when deserializing instances of DefaultValueFoo, and that’s presumably because there was less data to read in for the deserialization test; for that scenario, the libraries had to deserialize {"bizz":5,"bazz":["BUZZ","BOZZ"]}, meaning one less field to parse. Interestingly, Jackson did worse in deserializing ValueClassFoo compared to the other example classes. Again using the Java Flight Recorder profiler, here is a flame graph for Jackson deserializing PojoFoo: Likewise, here is a flame graph for Jackson deserializing ValueClassFoo: It appears that, in contrast to serialization actions, Jackson’s default deserializer is faster than a hand-rolled deserializer. Of course, there wasn’t a choice for doing this in the case of an inline class: it was either creating the custom deserializer or having the code crash. Final Thoughts While the tests provide promising results for the Kotlinx Serialization library, there are a few caveats that must be provided: The example classes were relatively simple to reduce the number of variables between testing scenarios. 
Conducting serialization and deserialization actions on large and complex data structures might provide entirely different results in favor of a different serialization library. Due to the Kotlinx Serialization code being developed for the Kotlin programming language, code written in Java would have to be rewritten in Kotlin to use the library, something that might be a very time-consuming endeavor and a hard sell for a project that has a large code base written in Java. The other three libraries, on the other hand, have no such restriction and can be used with Java and Kotlin alike. Regardless, the results suggest that it would behoove Kotlin developers to give the Kotlinx Serialization library a try in their projects, as aside from the high performance, it also provides the opportunity to be a “one-stop shop” for serialization not only for JSON but for other formats like Protobuf, YAML, and more.
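
To see the name mangling that trips up getter-based libraries such as Jackson and JSON-B, the following minimal sketch lists the accessor names that the Kotlin compiler generates for ValueClassFoo. It assumes a JVM target and uses only Java reflection from the standard library; the helper accessorNames is a hypothetical name introduced for illustration, and the exact hash suffixes printed may differ from the ones in the error output shown earlier.

```kotlin
// Same shape as the article's ValueClassFoo, minus the no-arg constructor.
data class ValueClassFoo(var fizz: String, var bizz: UInt, var bazz: List<String>)

// Hypothetical helper: collects the JVM-level getter and setter names of the class.
fun accessorNames(): List<String> =
    ValueClassFoo::class.java.declaredMethods
        .map { it.name }
        .filter { it.startsWith("get") || it.startsWith("set") }
        .sorted()

fun main() {
    // fizz and bazz keep plain JavaBean-style names (getFizz, setBazz, ...),
    // while the accessors for the UInt attribute carry a "-<hash>" suffix.
    accessorNames().forEach(::println)
}
```

A library that derives property names from getters and setters will read the suffixed accessor as a property called bizz plus the hash, which matches the mismatched field names in the Jackson and JSON-B output above; a field-based library such as GSON never looks at these methods.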

By Severn Everett

The YAML Document From Hell — JavaScript Edition

I recently came across this blog post from Ruud van Asseldonk titled “The YAML Document From Hell.” I’ve always heard that YAML has its pitfalls, but hadn’t looked into the details and thankfully hadn’t been affected, mainly due to my very infrequent and simple use of YAML. If you are in the same boat as me, I recommend reading that article now, as I almost can’t believe I’ve avoided any issues with it. The article digs into the issues in the YAML spec itself and then describes what happens in Python’s PyYAML and Golang’s YAML library with an example file, the titular YAML document from hell. I wanted to see how things were in the JavaScript ecosystem. YAML in JavaScript A search for JavaScript YAML parsers on npm brings up YAML (which I have used in my own project) and js-yaml. js-yaml has the most weekly downloads according to npm and the most stars on GitHub; however, YAML seems to be under more active development, having been most recently published a month ago (at the time of writing) compared to js-yaml’s last publish date almost 2 years ago. There is also yamljs, but the project hasn’t received a commit since November 2019 and hasn’t had a release for 6 years, so I am going to disregard it for now. Let’s see what YAML and js-yaml do with the YAML document from hell. The Document Itself To save yourself from going back and forth between van Asseldonk’s article and this one, here is the YAML document. server_config: port_mapping: # Expose only ssh and http to the public internet. - 22:22 - 80:80 - 443:443 serve: - /robots.txt - /favicon.ico - *.html - *.png - !.git # Do not expose our Git repository to the entire world. geoblock_regions: # The legal team has not approved distribution in the Nordics yet. - dk - fi - is - no - se flush_cache: on: [push, memory_pressure] priority: background allow_postgres_versions: - 9.5.25 - 9.6.24 - 10.23 - 12.13 So how do our JavaScript libraries handle this file? 
The Failures Anchors, Aliases, and Tags Let’s start with the failures. As described in the original article under the subhead “Anchors, aliases, and tags,” this section is invalid: serve: - /robots.txt - /favicon.ico - *.html - *.png - !.git # Do not expose our Git repository to the entire world. This causes both of our JavaScript YAML libraries to throw an error, both referencing an undefined alias. This is because the * is a way to reference an anchor created earlier in the document using an &. In our document’s case, that anchor was never created, so this is a parsing error. If you want to learn more about anchors and aliases, they seem to be particularly important in build pipelines: both Bitbucket and GitLab have written about how to use anchors to avoid repeating sections in YAML files. For the purposes of trying to get the file to parse, we can make those aliases strings as they were likely intended. serve: - /robots.txt - /favicon.ico - "*.html" - "*.png" - !.git # Do not expose our Git repository to the entire world. Now we get another parsing error from our libraries; both of them complain about an unknown or unresolved tag. The ! at the start of !.git is the character triggering this behaviour. Tags seem to be the most complicated part of YAML to me. They depend on the parser you are using and allow that parser to do something custom with the content that follows the tag. My understanding is that you could use this in JavaScript to, say, tag some content to be parsed into a Map instead of an Object or a Set instead of an Array. Van Asseldonk explains this with this alarming sentence: This means that loading an untrusted YAML document is generally unsafe, as it may lead to arbitrary code execution. PyYAML apparently has a safe_load method that will avoid this, but Go’s yaml package doesn’t. It seems that the JavaScript libraries also lack this feature, so the warning for untrusted YAML documents stands. 
If you do want to take advantage of the tag feature in YAML, you can check out the yaml package’s documentation on custom data types or js-yaml’s supported YAML types and unsafe type extensions. To make the YAML file parse, let’s encase all the weird YAML artifacts in quotes to make them strings: serve: - /robots.txt - /favicon.ico - "*.html" - "*.png" - "!.git" # Do not expose our Git repository to the entire world. With the serve block looking as it does above, the file now parses. So what happens to the rest of the potential YAML gotchas? Accidental Numbers One thing that I am gathering from this investigation so far is that if you need something to be a string, do not be ambiguous about it: surround it in quotes. That counted for the aliases and tags above, and it also counts for accidental numbers. In the following section of the YAML file you see a list of version numbers: allow_postgres_versions: - 9.5.25 - 9.6.24 - 10.23 - 12.13 Version numbers are strings; numbers can’t have more than one decimal point in them. But when this is parsed by either JavaScript library the result is as follows: allow_postgres_versions: [ '9.5.25', '9.6.24', 10.23, 12.13 ] Now we have an array of strings and numbers. If a YAML parser thinks something looks like a number, it will parse it as such. And when you come to use those values, they might not act as you expect. Version Numbers in GitHub Actions I have had this issue within GitHub Actions before. It was in a Ruby project, but this applies to anyone trying to use version numbers in a GitHub Actions YAML file. I tried to use a list of Ruby version numbers; this worked fine up until Ruby version 3.1 was released. I had 3.0 in the array. Within GitHub Actions this was parsed as the integer 3. This might seem fine, except that when you give an integer version to GitHub Actions it picks the latest minor point for that version. So, once Ruby 3.1 was released, the number 3.0 would select version 3.1. 
I had to make the version number a string, "3.0", and then it was applied correctly. Accidental numbers cause issues. If you need a string, make sure you provide a string.

The Successes

It’s not all bad in the JavaScript world. After working through the issues above, we might now be in the clear. Let’s take a look now at what parsed correctly from this YAML file.

Sexagesimal Numbers

Under the port mapping section of the YAML file we see:

    port_mapping:
      # Expose only ssh and http to the public internet.
      - 22:22
      - 80:80
      - 443:443

That 22:22 is dangerous in YAML version 1.1: PyYAML parses it as a sexagesimal (base 60) number, giving the result 1342 (22 × 60 + 22). Thankfully both JavaScript libraries have implemented YAML 1.2 and 22:22 is parsed correctly as a string in this case.

    port_mapping: [ '22:22', '80:80', '443:443' ]

The Norway Problem

In YAML 1.1 no is parsed as false. This is known as “the Norway problem” because listing countries as two character identifiers is fairly common, and having this YAML:

    geoblock_regions:
      - dk
      - fi
      - is
      - no
      - se

parsed into this JavaScript:

    geoblock_regions: [ 'dk', 'fi', 'is', false, 'se' ]

is just not helpful. The good news is that, unlike Go’s YAML library, both JavaScript libraries have implemented YAML 1.2 and dropped no as an alternative for false. The geoblock_regions section is successfully parsed as follows:

    geoblock_regions: [ 'dk', 'fi', 'is', 'no', 'se' ]

Non-String Keys

You might believe that keys in YAML would be parsed as strings, like JSON. However, they can be any value. Once again there are values that may trip you up. Much like with the Norway problem in which yes and no can be parsed as true and false, the same goes for on and off. This is manifested in our YAML file in the flush_cache section:

    flush_cache:
      on: [push, memory_pressure]
      priority: background

Here the key is on, but in some libraries it is parsed as a boolean. In Python, even more confusingly, the boolean is then stringified and appears as the key "True".
Thankfully this is handled by the JavaScript libraries and on becomes the key "on".

    flush_cache: { on: [ 'push', 'memory_pressure' ], priority: 'background' }

This is of particular concern in GitHub Actions again, where on is used to determine what events should trigger an Action. I wonder if GitHub had to work around this when implementing their parsing.

Parsing as YAML Version 1.1

Many of the issues that our JavaScript libraries sidestep are problems from YAML 1.1, and both libraries have fully implemented YAML 1.2. If you do wish to throw caution to the wind, or you have to parse a YAML file explicitly with YAML 1.1 settings, the yaml library can do that for you. You can pass a second argument to the parse function to tell it to use version 1.1, like so:

    import { parse } from "yaml";
    const yaml = parse(yamlContents, { version: "1.1" });
    console.log(yaml);

Now you get a result with all of the fun described above:

    {
      server_config: {
        port_mapping: [ 1342, '80:80', '443:443' ],
        serve: [ '/robots.txt', '/favicon.ico', '*.html', '*.png', '!.git' ],
        geoblock_regions: [ 'dk', 'fi', 'is', false, 'se' ],
        flush_cache: { true: [ 'push', 'memory_pressure' ], priority: 'background' },
        allow_postgres_versions: [ '9.5.25', '9.6.24', 10.23, 12.13 ]
      }
    }

Note that in this case I left the aliases and tags quoted as strings so that the file could be parsed successfully. Stick with version 1.2, the default in both JavaScript YAML libraries, and you’ll get a much more sensible result.

Isn’t YAML Fun?

In this post we’ve seen that it’s easy to write malformed YAML if you aren’t aware of aliases or tags. It’s also easy to write mixed arrays of strings and numbers. There are also languages and libraries in which YAML 1.1 is still hanging around, where on, yes, off, and no are booleans and some numbers can be parsed into base 60. My advice, after going through all of this, is to err on the side of caution when writing YAML.
If you want a key or a value to be a string, surround it in quotes and explicitly make it a string. On the other hand, if you are parsing someone else’s YAML then you will need to program defensively and try to handle the edge cases, like accidental numbers, that can still cause issues.

Finally, if you have the option, choose a different format to YAML. YAML is supposed to be human-friendly, but the surprises and the bugs that it can produce are certainly not developer-friendly, and ultimately that defeats the purpose. The conclusion to the original YAML document from hell post suggests many alternatives to YAML that will work better. I can’t help but think that in the world of JavaScript something JSON-based, but friendlier to author, should be the solution. There is a package that simply strips comments from JSON, or there’s JSON5, a JSON format that aims to be easier to write and maintain by hand. JSON5 supports comments as well as trailing commas, multiline strings, and various number formats. Either of these is a good start if you want to make authoring JSON easier and parsing hand-authored files more consistent.

If you can avoid YAML, I recommend it. If you can’t, good luck.
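To make the advice about programming defensively a little more concrete, here is a minimal sketch of a check that runs on already-parsed data. It is written in Python, but the same walk is easy to port to JavaScript. The helper name find_non_strings and the rule that every key and leaf value in this config should be a string are assumptions made for illustration; the sample data mirrors the YAML 1.1 parse result shown earlier.

```python
def find_non_strings(value, path="$"):
    """Yield (path, value) for every key or leaf that is not a string."""
    if isinstance(value, dict):
        for key, item in value.items():
            # Keys can be surprising too, e.g. True where "on" was written.
            if not isinstance(key, str):
                yield f"{path}.<key {key!r}>", key
            yield from find_non_strings(item, f"{path}.{key}")
    elif isinstance(value, list):
        for index, item in enumerate(value):
            yield from find_non_strings(item, f"{path}[{index}]")
    elif not isinstance(value, str):
        yield path, value


# Hypothetical input, shaped like the YAML 1.1 result from the article.
parsed = {
    "port_mapping": [1342, "80:80", "443:443"],
    "geoblock_regions": ["dk", "fi", "is", False, "se"],
    "flush_cache": {True: ["push", "memory_pressure"], "priority": "background"},
    "allow_postgres_versions": ["9.5.25", "9.6.24", 10.23, 12.13],
}

problems = list(find_non_strings(parsed))
for where, bad in problems:
    print(f"{where}: expected a string, got {type(bad).__name__} {bad!r}")
```

Reporting the offending paths, rather than silently converting the values back to strings, is a deliberate choice here: once 3.0 has been parsed as a number, the original text can no longer be recovered reliably.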

By Phil Nash

Using Regular Expressions in Python: A Brief Guide

Regular expressions are effective tools for pattern matching and text processing. A regex, or regular expression, is a sequence of characters that forms a search pattern, which you can use to determine whether a string contains a particular pattern. Regular expressions are supported by many programming languages, including Python, which provides a powerful and flexible regular expression engine that can handle a wide range of text-matching tasks. This article provides a brief guide to using regular expressions in Python.

What Are Regular Expressions?

Regular expressions (RegEx) are character combinations that define a search pattern used to locate a string or group of strings, for example to find all email addresses in a document or to validate the format of a phone number. A regex can tell whether a text matches a specific pattern, and it can also divide a pattern into one or more sub-patterns. Python supports regular expressions through the built-in re module. One of its core operations is searching: given a regular expression and a string, re.search returns the first match, or None if there is no match. Regular expressions are often used in text editors, command-line utilities, and programming languages.

Regular expressions consist of two types of characters:

Literals: These are characters that match themselves. For example, the letter "a" will match the letter "a" in a text string.
Metacharacters: These are characters that have a special meaning. For example, the dot (.) metacharacter matches any single character.

Using Regular Expressions in Python

Python provides a built-in module called "re" that provides regular expression support. This module provides several functions for working with regular expressions, including searching for matches, replacing matches, and splitting a string into a list of substrings based on a pattern.
The "re" module also recognizes several special characters that can be used to create complex regular expressions. Here are some of the most commonly used special characters in regular expressions:

Character   Description
.       Matches any single character except the newline (\n). For example, the regular expression he.. will match "hell", "help", etc.
*       Matches zero or more occurrences of the preceding character. For example, the regular expression a* will match zero or more occurrences of the letter "a".
+       Matches one or more occurrences of the preceding character. For example, the regular expression a+ will match one or more occurrences of the letter "a".
?       Matches either zero or one instance of the preceding character. For example, the regular expression colou?r will match both "color" and "colour".
{m,n}   Matches the previous character between m and n times. For example, the regular expression a{2,3} will match either "aa" or "aaa".
[]      Matches any single character within the brackets. For instance, the regular expression [aeiou] will match any lowercase vowel.
\       Drops the special meaning of the character following it. For example, the regular expression \. will match a period character.
^       The string should start with the characters following ^. For example, the regular expression ^hello will match only if the string starts with hello.
$       The string should end with the characters preceding $. For example, the regular expression hello$ will match only if the string ends with hello.
|       Either or. For example, the regular expression suman|ritik checks whether the string contains either "suman" or "ritik".

Let’s discuss some of the most important metacharacters in detail:

. - Dot

The dot (.) symbol matches exactly one character, which can be any character except the newline (\n). For instance:

a.b matches an a, then any single character other than a newline, then a b, so it is found in strings such as acb, acbd, and abbb.
.. will determine whether the string has at least two consecutive characters that are not newlines.
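As a quick check of the table and the dot examples above, the following sketch runs a handful of patterns through re.search and prints whether each one is found. The sample strings are illustrative choices rather than data from a real project, and the cases list is only a convenience for this demonstration.

```python
import re

# Each entry: (pattern, sample text, whether re.search should find a match).
cases = [
    (r"he..", "help", True),          # . matches any single character
    (r"a.b", "acbd", True),           # "acb" is found inside "acbd"
    (r"a.b", "a\nb", False),          # . does not match a newline by default
    (r"colou?r", "color", True),      # ? makes the "u" optional
    (r"colou?r", "colour", True),
    (r"a{2,3}", "a", False),          # needs at least two consecutive "a"s
    (r"^hello", "say hello", False),  # ^ anchors the match at the start
    (r"hello$", "say hello", True),   # $ anchors the match at the end
    (r"\.", "suman", False),          # an escaped dot only matches a literal "."
]

for pattern, text, expected in cases:
    found = re.search(pattern, text) is not None
    print(f"{pattern!r:12} in {text!r:14} -> {found}")
    assert found == expected
```

Writing the patterns as raw strings (the r prefix) keeps the backslashes intact, so Python does not try to interpret them as string escapes before the regex engine sees them.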
* - Star

The star (*) symbol matches zero or more instances of the regex that comes before it. For instance, ab*c will be matched for the strings ac, abc, abbbc, dabc, etc., but not for abdc, because there the b is not followed by c.

+ - Plus

The plus (+) symbol matches one or more instances of the regex that comes before it. For instance, ab+c will be matched for the strings abc, abbc, and dabc, but not for ac or abdc, because there is no b in ac and the b is not followed by c in abdc.

? - Question Mark

The question mark (?) matches the regex that comes before it either once or not at all. For instance, ab?c will be matched for the strings ac, acb, and dabc. It will not be matched for abbc, because there are two b’s, or for abdc, because the b is not followed by c.

{m,n} - Braces

The braces match any number of repetitions of the preceding regex from m to n, inclusive. For example, the strings aaab, baaac, and gaad will be matched for the regular expression a{2,4}, but strings like abc and bc will not, because they contain either just one a or none at all. Write the quantifier without a space after the comma; Python's re module does not treat a{2, 4} as a repetition.

[] - Square Brackets

Square brackets ([]) represent a character class made up of the group of characters that we want to match. The character class [abc] will, for instance, match any single a, b, or c. With the - symbol between the square brackets, we can also specify a range of characters. For instance:

[0123] can be written as [0-3].
[abc] can be written as [a-c].

The caret (^) sign can be used to invert the character class as well. For instance:

[^0-3] matches any character other than 0, 1, 2, or 3.
[^a-c] matches any character that is not an a, b, or c.

\ - Backslash

To ensure that a character is not given special treatment, put a backslash (\) in front of it. This can be thought of as a metacharacter escape. As an illustration, the dot (.) will be treated as a metacharacter if you search for it in the string as-is (as shown in the table above). To strip its special meaning, we put a backslash (\) before the dot (.) in this instance. The example below will help you understand.

Code:

    import re

    s = 'suman.singh'

    # without using \
    match = re.search(r'.', s)
    print(match)

    # using \
    match = re.search(r'\.', s)
    print(match)

Output:

    <re.Match object; span=(0, 1), match='s'>
    <re.Match object; span=(5, 6), match='.'>

| - Or Symbol

The or symbol determines whether the pattern before or after it is present in the string. For instance, a|b will be matched for any string that contains either a or b, such as acd, bcd, abcd, etc.

Special Sequences

Rather than matching an actual character, some special sequences specify the precise position in the search string where the match must take place, while others stand for whole classes of characters. They make it simpler to write patterns that are used frequently.

Special Sequences List

Special Sequence   Description   Example
\A   Matches if the specified characters appear at the start of the string.   \Afor -> for suman
\b   Matches if the word starts or ends with the provided characters. \b(string) will look for the word's beginning, and (string)\b will look for the word's ending.   \bsu -> suman
\B   In contrast to \b, the match must not be at the beginning or end of a word.   \Bge -> together
\d   Matches any decimal digit; this is similar to the set class [0-9].   \d -> 1526
\D   Matches any character that is not a digit; this is the same as the set class [^0-9].   \D -> suman
\s   Matches any whitespace character.   \s -> sum an
\S   Matches any non-whitespace character.   \S -> s uman
\w   Matches any alphanumeric character or underscore; this is comparable to the class [a-zA-Z0-9_].   \w -> 3425
\W   Matches any non-alphanumeric character.   \W -> >$
\Z   Matches if the string ends with the specified regex.   an\Z -> suman

Basic Regular Expression Operations

1. Searching for Matches

The most basic operation in regular expressions is searching for a match in a string. The "re" module provides the "search" function for this purpose. Here is an example of how to use the "search" function to find a pattern in a string:

Code:

    import re

    text = "Suman Raghav and Ron are friends"
    pattern = "friends"
    result = re.search(pattern, text)
    if result:
        print("String Pattern Found")
    else:
        print("String Pattern not Found")

This code will output "String Pattern Found" because the pattern "friends" is found in the text.

2. Replacing Matches

Another common operation in regular expressions is replacing matches in a string. The "re" module provides the "sub" function for this purpose. Here is an example of how to use the "sub" function to replace a pattern in a string:

Code:

    import re

    text = "Suman Raghav and Ron are friends"
    pattern = "friends"
    replacement = "students"
    result = re.sub(pattern, replacement, text)
    print(result)

This code will output "Suman Raghav and Ron are students" because the pattern "friends" is replaced with "students" in the original text.

3. Splitting a String Based on a Pattern

The "re" module can also be used to split a string into a list of substrings based on a pattern. The "split" function is used for this purpose. Here is an example of how to use the "split" function to split a string based on whitespace characters:

Code:

    import re

    text = "Suman Raghav and Ron are friends"
    # A raw string keeps the backslash in \s intact
    result = re.split(r"\s", text)
    print(result)

This code will output ['Suman', 'Raghav', 'and', 'Ron', 'are', 'friends'] because the string is split based on whitespace characters.

4. Regular Expression Flags

Regular expressions in Python support flags that modify the behavior of the regular expression engine. Flags are passed through the optional flags argument of the regular expression functions.
Some of the most widely used flags are listed below:

re.IGNORECASE or re.I: Makes the regular expression case-insensitive.
re.MULTILINE or re.M: Allows the ^ and $ metacharacters to match the beginning and end of each line in a multiline string rather than just the beginning and end of the entire string.
re.DOTALL or re.S: Makes the dot (.) metacharacter match any character, including a newline character (\n).
re.ASCII or re.A: Limits the regular expression engine to ASCII characters only.

Here is an example of how to use the IGNORECASE flag to make a regular expression case-insensitive:

Code:

    import re

    text = "Suman has a brown coloured bag."
    pattern = "BROWN"
    result = re.search(pattern, text, re.IGNORECASE)
    if result:
        print("String Pattern Found")
    else:
        print("String Pattern not Found")

This code will output "String Pattern Found": although the pattern "BROWN" is in uppercase and the text contains "brown", the search was performed with the IGNORECASE flag, so the case difference is ignored.

5. Grouping and Capturing

Regular expressions in Python also support the grouping and capturing of substrings within a match. Grouping is achieved using parentheses (()). The contents of each group are captured and can be accessed using the "group" method of the match object; group(1) returns the first group. Here is an example of how to use grouping and capturing in regular expressions:

Code:

    import re

    text = "Suman Singh (sumansingh@example.com) wrote an email"
    # A raw string keeps the backslashes in \w and \. intact
    pattern = r"(\w+@\w+\.\w+)"
    result = re.search(pattern, text)
    if result:
        print("Email address validated: " + result.group(1))
    else:
        print("Email address not validated")

This code will output "Email address validated: sumansingh@example.com" because the regular expression matches the email address in the text and captures it using a group.

Conclusion

For text processing and pattern matching, regular expressions are an effective tool. The "re" module in Python provides a flexible and powerful regular expression engine.
Special characters such as ., *, +, ?, ^, $, [], (), and | are used to define patterns in regular expressions. The most commonly used regular expression functions in Python are "search", "match", "findall", "sub", and "split". Regular expression flags such as re.IGNORECASE, re.MULTILINE, re.DOTALL, and re.ASCII can modify the behavior of the regular expression engine. Grouping and capturing of substrings within a match can be achieved using parentheses (()) and the "group" method of the match object.
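The summary above lists "match" and "findall" alongside the functions demonstrated earlier, so here is a short sketch of how they differ from "search". It reuses the sample sentence from the earlier examples; the multi-line string is an illustrative assumption.

```python
import re

text = "Suman Raghav and Ron are friends"

# re.match only succeeds if the pattern matches at the very start of the string.
starts_with_suman = re.match(r"Suman", text) is not None
starts_with_ron = re.match(r"Ron", text) is not None
print(starts_with_suman, starts_with_ron)

# re.search scans the whole string and returns the first match it finds.
ron_span = re.search(r"Ron", text).span()
print(ron_span)

# re.findall returns every non-overlapping match as a list of strings.
capitalized = re.findall(r"\b[A-Z]\w*", text)
print(capitalized)

# With re.MULTILINE, ^ matches at the start of every line, not just the string.
lines = "one\ntwo\nthree"
t_words = re.findall(r"^t\w+", lines, re.MULTILINE)
print(t_words)
```

In practice, "search" is usually the right choice for asking whether a pattern occurs anywhere, while "match" suits input that must begin with the pattern and "findall" suits collecting every occurrence.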

By Sarang S Babu

CSS Variables Scoping to Create and Theme Flexible UI Components

The real name of CSS variables is CSS Custom Properties. It is still a draft standard (yes, as I write these lines, it is a Candidate Recommendation Snapshot), but it is widely supported by modern browsers.

CSS variables allow us, like variables in any other programming language, to store a value we can reuse across our document. For example, if we define a CSS variable for the primary color with --primary-color: #f00;, then we can use it in any component like this:

    .my-component {
      color: var(--primary-color);
    }

Usually, you “attach” your variable to :root, which means the variable will be available in the whole document:

    :root {
      --primary-color: #f00;
    }

In this example, :root is the variable scope.

Using Together With SCSS

If you want to assign values from SCSS variables to CSS variables, you cannot use the “normal” notation:

    // ❌ This doesn't work
    $scss-var: #f00;
    --my-var: $scss-var;

In the example, the value of --my-var is literally $scss-var, not the value of $scss-var. This behavior was chosen to provide maximum compatibility with plain CSS. To make it work, you need to use the Sass interpolation syntax, #{my scss script code}:

    // ✅ This works
    $scss-var: #f00;
    --my-var: #{$scss-var};

Scope

A variable is only available in the element where it is defined and in that element’s descendants; that is the scope of the variable. Outside of it, the variable doesn’t exist. If you try to use a variable that is not in scope, you will not get an error, but the declaration that uses the nonexistent variable is treated as invalid, and the property falls back to its inherited or initial value.

Hoisting

Like JS variables, CSS variables behave as if they were moved to the top, so you can use them before defining them. Their values are resolved when the styles are computed, not in source order.
    .my-element {
      color: var(--primary-color);
    }

    :root {
      --primary-color: #f00;
    }

Override

As I mentioned before, a variable has a scope where it exists, but what happens if a variable with the same name is defined in two scopes? The same thing happens as with a JS variable: the nearest local scope overrides other values:

    :root {
      --color: #0f0;
    }

    .my-element {
      --color: #0ff;
      color: var(--color);
    }

This behavior is very convenient when we work with UI components that have different styles depending on modifiers.

CSS Variables in UI Components

Imagine we have a simple button component like this:

    <button class="ui-button"> Button content </button>

    .ui-button {
      background: #333;
      color: #fff;
      font-size: 12px;
      padding: 4px 10px;
    }

This button has different variants by color (default, red, and green) and size (default, small, and big). Using BEM, we can add a modifier class like .ui-button--green or .ui-button--big and use that to overwrite the styles, for example:

    .ui-button {
      background: #333;
      color: #fff;
      font-size: 12px;
      padding: 4px 10px;

      &--green {
        background: #1F715F;
      }

      &--big {
        font-size: 16px;
        padding: 6px 20px;
      }
    }

This approach works perfectly, but we need to know which properties to overwrite and do it explicitly for each modifier. It is easy to forget something, and if we add a new property affected by the modifiers, we have to add it to all of them.

Suppose we rewrite the styles using CSS variables, parameterizing the component styles.
In that case, we can override the CSS variable values for each modifier without changing the CSS styles themselves, only the values of the variables:

    .ui-button {
      --bg-color: #333;
      --text-color: #fff;
      --font-size: 12px;
      --padding: 4px 10px;

      background: var(--bg-color);
      color: var(--text-color);
      font-size: var(--font-size);
      padding: var(--padding);

      &--green {
        --bg-color: #1F715F;
      }

      &--red {
        --bg-color: #f00;
      }

      &--big {
        --font-size: 16px;
        --padding: 6px 20px;
      }

      &--small {
        --font-size: 10px;
        --padding: 3px 5px;
      }
    }

Variable Scope Priority

In CSS, an element can use more than one class, which means the element’s CSS variables can be defined by multiple scopes at the same level. For example, if we apply the green and red modifiers at the same time:

    <button class="ui-button ui-button--green ui-button--red"> Green + red </button>

Both ui-button--green and ui-button--red define the same --bg-color variable. What value will be applied to the element?

In cases like that, the order of the classes in the class attribute does not matter. Both selectors have the same specificity, so the rule that appears last in the stylesheet wins. In the example, &--red is declared after &--green, so the button will be red, and <button class="ui-button ui-button--red ui-button--green"> will be red as well. To make green win, declare its rule after the red one.

Summarizing

Using CSS variables and scopes is a powerful tool when you are developing components in general. If your components have modifiers, it requires extra work at the beginning to parameterize the component, but after that it is simpler to create variants and modifiers.

By Sergio Carracedo


The Latest Languages Topics