Daniel Lemire's blog
143 subscribers
18 links
This channel can be used to follow Daniel Lemire's blog. It is COVID-free. If you would like news on COVID, follow https://t.me/covidinfoenglish
Many of us feel that the current intellectual climate is difficult to bear. When I first noticed the phenomenon, people told me that it was because of Donald Trump. He just made it impossible to debate calmly. But now that Trump is gone, the climate is just as bad, if not worse.

Debates are essential in a free society. The alternative to debate is force. Either you convince your neighbour to do as you think they should do, or else you send men with guns to their place.

It is tempting, when you have the upper hand, to use force and aggressive tactics against your opponents. However, this leaves little choice to your opponents: they have to return the favour. And if you both live long enough, chances are that they will.

Civility is a public good. We all have to commit to it.

I do not pretend to be the perfect debater. I make mistakes all the time. However, I try to follow these rules.

Do not hope to change people’s core stance. This rarely, if ever, happens. That is not why we debate. If someone is in favour of Brexit and you are not, you can argue until you are blue in the face and they won’t change their stance. One of the core reasons to debate is to find common ground. People will naturally shy away from arguments that are weak. You can see a debate as a friendly battleground. Once the battle is over, you have probably not taken the other person’s moat, but if you did your job, you have enticed them to drop bad arguments. And they have done the same: they have exposed weaknesses in your models. It implies that the debate should bear on useful elements, like arguments and facts. It also implies that debate can and should be productive… even if it never changes anyone’s stance.
Your goal in a debate is neither to demonstrate that the other person is bad nor that you are good. Leave people’s character out of the debate. This includes your own character. For example, never argue that you are a good person. Reject character assassination, either of yourself or of others. The most popular character assassination tactic is “by association”: “your employer once gave money to Trump so you are a racist”. “You read this news source so you are part of their cult.” You must reject such arguments, whether they are applied to you or to others. Another popular tactic is to question people’s motives. Maybe someone works for a big oil company, so that explains why they are in favour of Brexit. Maybe someone is a member of the communist party, and that’s why they want to give the government more power. It is true that people’s motives impact their opinions, but speculating about motives has no place in civil debate. You can privately think that a given actor is “sold out”, but you should not say it.
Shy away from authority-based arguments. Saying that something is true because a given individual says so is counterproductive: the other side can do the same, and the debate will be sterile. You can and should provide references and sources, but for the facts and arguments that they carry, not for their authority.
I believe that a case can be made that without a good intellectual climate, liberalism is bound to fade away. If you want to live in a free society, you have to help enforce good debates. If you are witnessing bad debates, speak up. Remind people of the rules. In fact, if I deviate from these rules, remind me: I will thank you. https://lemire.me/blog/2021/09/09/how-i-debate/
For software performance, can you always trust inlining?

It is easier for an optimizing compiler to spot and eliminate redundant operations if it can operate over a large block of code. Nevertheless, it is still recommended to stick with small functions. Small functions are easier to read and debug. Furthermore, you can often rely on the compiler to smartly inline your small functions inside larger functions. In addition, if you have a just-in-time compiler (e.g., in C# and Java), the compiler may often focus its energy on functions that are called more often. Thus small functions are more likely to get optimized. Nevertheless, you sometimes want to manually inline your code. Even the smartest compilers get it wrong. Moreover, some optimizing compilers are simply less aggressive than others. We have been working on a fast float-parsing library in C# called csFastFloat. Though it is already several times faster than the standard library, the primary author (Verret) wanted to boost the performance further by using SIMD instructions. SIMD instructions are fancy instructions that allow you to process multiple words at once, unlike regular instructions. The C# language allows you to use…

https://lemire.me/blog/2021/10/09/for-software-performance-can-you-always-trust-inlining/
Science and Technology links (October 10th 2021)

Evans and Chu suggest, using data and a theoretical model, that as the number of scientists grows, progress may stagnate. Simply put, in a large field, with many researchers, a few papers and a few people are able to acquire a decisive advantage over newcomers. Large fields allow more inequality. One can review their model critically. Would you not expect large fields to fragment into smaller fields? And haven’t we seen much progress in fields that have exploded in size? Keeping your iron stores low might be important to slow your aging. Sadly, much of what you eat has been supplemented with iron because a small fraction of the population needs iron supplementation. It is widely believed that you cannot have too much iron supplementation, but, to my knowledge, long-term effects of iron supplementation have not been carefully assessed. If you lose weight while having high levels of insulin, you are more likely to lose lean tissues (e.g., muscle) than fat. A drug similar to viagra helps mice fight obesity. Age-related hair loss might be due to stem cells escaping…

https://lemire.me/blog/2021/10/10/science-and-technology-links-october-10th-2021/
Channel name was changed to «Daniel Lemire's blog»
Calling a dynamically compiled function from Go

Compiled programming languages are typically much faster than interpreted programming languages. Indeed, the compilation step produces “machine code” that is ideally suited for the processor. However, most programming languages today do not allow you to change the code you compiled. It means that if you find out which function you need after the code has been compiled, you have a problem. It happens all of the time. For example, you might have a database engine that has to do some expensive processing. The expensive work might be based on a query that is only provided long after the code has been compiled and started. It means that your performance could be limited because the code that runs the query was written without sufficient knowledge of the query. Let us illustrate the issue and a possible solution in Go (the programming language). Suppose that your program needs to take the power of some integer. You might write a generic function with a loop, such as the following: func Power(x int, n int) int { product := 1 for i := 0;…

https://lemire.me/blog/2021/10/14/calling-a-dynamically-compiled-function-from-go/
Science and Technology links (October 16th 2021)

The thymus is an important component of our immune system. As we age, the thymus degenerates and our immune system becomes less fit: emotional and physical distress, malnutrition, and opportunistic bacterial and viral infections damage the thymus. New research suggests that practical thymus regeneration could be closer than ever. The thymus is able to repair itself and the right mix of signals could convince it to stay fit over time. It appears that women use sexual vocalization during penetration to increase their sexual attractiveness to their current partner by means of boosting their partner’s self-esteem. People associate creativity with effortless insight and undervalue persistence. If you just sit down and work hard, you can generate many new ideas. If you just wait for ideas to come, you will generate far fewer ideas. 63% of faculty perceive that their university does not care about the content of their publications, only where and how much is published. Such a culture can easily lead to people producing work that passes peer review but fails to contribute meaningfully to science or engineering. Redheaded women…

https://lemire.me/blog/2021/10/16/science-and-technology-links-october-16th-2021/
Converting binary floating-point numbers to integers

You are given a floating-point number, e.g. a double type in Java or C++. You would like to convert it to an integer type… but only if the conversion is exact. In other words, you want to convert the floating-point number to an integer and check if the result is exact. In C++, you could implement such a function using a few lines of code:
    bool to_int64_simple(double x, int64_t *out) {
      int64_t tmp = int64_t(x);
      *out = tmp;
      return tmp == x;
    }
The code is short in C++, and it will compile to a few lines of assembly. In x64, you might get the following:
    cvttsd2si rax, xmm0
    pxor xmm1, xmm1
    mov edx, 0
    cvtsi2sd xmm1, rax
    mov QWORD PTR [rdi], rax
    ucomisd xmm1, xmm0
    setnp al
    cmovne eax, edx
You could do it in a different manner. Instead of working with high-level instructions, you could copy your binary floating-point number to a 64-bit word and use your knowledge of the IEEE binary64 standard to extract the mantissa and the exponent. It is much more code. It also involves pesky branches.…

https://lemire.me/blog/2021/10/21/converting-binary-floating-point-numbers-to-integers/
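The short conversion routine in the excerpt above casts before it checks, and in C++ the cast itself is undefined when the value does not fit in a 64-bit integer. Below is a minimal sketch of a guarded variant. The name to_int64_checked and the choice to leave the output untouched on failure are assumptions made for illustration; they are not taken from the post.

```cpp
#include <cstdint>

// Hypothetical guarded variant of the conversion from the excerpt.
// Returns true and writes *out only when x is an integer that fits in
// int64_t. NaN, infinities and out-of-range values return false.
bool to_int64_checked(double x, int64_t *out) {
  // -2^63 and 2^63 are both exactly representable as doubles.
  const double lower = -9223372036854775808.0;
  const double upper = 9223372036854775808.0;
  if (!(x >= lower && x < upper)) { // the negated test also rejects NaN
    return false;
  }
  int64_t tmp = static_cast<int64_t>(x); // in range, truncates toward zero
  if (static_cast<double>(tmp) != x) {
    return false; // x had a fractional part
  }
  *out = tmp;
  return true;
}
```

The range test costs an extra branch or two, so whether it is worth it depends on whether the inputs can be trusted to be in range.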
Science and Technology links (October 23rd 2021)

Apple announced new processors for its computers. Here is a table with the transistor count of some recent Apple processors:
    processor       release year   transistors
    Apple A7        2013           1 billion
    Apple A8        2014           2 billion
    Apple A9        2015           2 billion
    Apple A10       2016           3.2 billion
    Apple A11       2017           4.3 billion
    Apple A12       2018           6.9 billion
    Apple A13       2019           8.5 billion
    Apple A14       2020           11.8 billion
    Apple M1        2020           16 billion
    Apple M1 Max    2021           57 billion
We find that the number of transistors on commodity processors doubles every two years, approximately. Furthermore, the energy usage of chips is trending downward: the Apple M1 Max reportedly uses less than 100 Watts, which is comparable with processors of the last two decades despite a much greater performance. The storage capacity and bandwidth are also trending upward: the Apple M1 Max has 64 GB of RAM with a bandwidth of 400 GB/s. However, our processors are increasingly parallel in nature: the M1 Max has ten generic processing cores and 32 graphical cores. You are unlikely to be able to make use of the 64 GB of…

https://lemire.me/blog/2021/10/23/science-and-technology-links-october-23rd-2021/
In C++, is empty() faster than comparing the size with zero?

Most C++ programmers rely on “STL” for their data structures. The most popular data structure is probably vector, which is just a dynamic array. The set and the map are other useful ones. The STL data structures are a minimalist design. You have relatively few methods. All of them allow you to compute the size of the data structure, that is, how many elements it contains, via the size() method. In recent C++ (C++11), the size() method must have constant-time complexity for all containers. To put it in clearer terms, the people implementing the containers can never scan the content to find out the number of elements. These containers also have another method called empty() which simply returns true if the container is… well… empty. Obviously, an equivalent strategy would be to compare the size with zero:  mystruct.size() == 0. Obviously, determining whether a data structure is empty is conceptually easier than determining its size. Thus, at least in theory, calling empty() could be faster. Inspecting the assembly output, I find that recent versions of GCC produce nearly identical code…

http://lemire.me/blog/2021/10/26/in-c-is-empty-faster-than-comparing-the-size-with-zero/
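To compare the two strategies from the excerpt above yourself, it helps to isolate each one in a tiny function and inspect the generated assembly (for example with a compiler's -S flag). The following is a minimal sketch; the helper names are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Strategy 1: ask the container directly whether it is empty.
template <typename Container>
bool is_empty_via_empty(const Container &c) {
  return c.empty();
}

// Strategy 2: compare the element count with zero.
template <typename Container>
bool is_empty_via_size(const Container &c) {
  return c.size() == 0;
}

// Explicit instantiations so that a compiler emits code that can be
// inspected, even when nothing else calls the templates.
template bool is_empty_via_empty(const std::vector<int> &);
template bool is_empty_via_size(const std::vector<int> &);
template bool is_empty_via_empty(const std::set<int> &);
template bool is_empty_via_size(const std::set<int> &);
```

Both functions must agree on every container state; only the instructions used to get there may differ between compilers and standard libraries.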
In C, how do you know if the dynamic allocation succeeded?

In the C programming language, we allocate memory dynamically (on the heap) using the malloc function. You pass malloc a size parameter corresponding to the number of bytes you need. The function returns either a pointer to the allocated memory or the NULL pointer if the memory could not be allocated. Or so you may think. Let us write a program that allocates 1 terabyte of memory and then tries to write to this newly allocated memory:
    #include <stdio.h>
    #include <stdlib.h>
    int main() {
      size_t large = 1099511627776;
      char *buffer = (char *)malloc(large);
      if (buffer == NULL) {
        printf("error!\n");
        return EXIT_FAILURE;
      }
      printf("Memory allocated\n");
      for (size_t i = 0; i < large; i += 4096) {
        buffer[i] = 0;
      }
      free(buffer);
      return EXIT_SUCCESS;
    }
After compiling and running this program, you would expect to either get the message “error!”, in which case the program terminates immediately… or else you might expect to see “Memory allocated” (if 1 terabyte of memory is available) in which case the program should terminate successfully. Under both macOS/clang and Linux/GCC, I find that the program…

https://lemire.me/blog/2021/10/27/in-c-how-do-you-know-if-the-dynamic-allocation-succeeded/
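The allocate-then-write pattern of the program above can be packaged as a small helper, which makes it easier to experiment with different sizes. This is a minimal sketch: the function name allocate_and_touch is hypothetical, and the 4096-byte stride is an assumed page size carried over from the excerpt. On systems that back memory lazily, a non-null return from malloc does not by itself show that every page can be written, and this helper does not make such a failure recoverable.

```cpp
#include <cstddef>
#include <cstdlib>

// Allocates `bytes` bytes and writes one byte every 4096 bytes, like the
// loop in the excerpt. Returns nullptr if malloc reports failure.
// The caller releases the memory with std::free.
char *allocate_and_touch(std::size_t bytes) {
  char *buffer = static_cast<char *>(std::malloc(bytes));
  if (buffer == nullptr) {
    return nullptr; // the failure case described by the C standard
  }
  for (std::size_t i = 0; i < bytes; i += 4096) {
    buffer[i] = 0; // the first write to a page requires real memory
  }
  return buffer;
}
```

Calling it with a modest size, such as one megabyte, should succeed on typical systems; very large sizes are where the behaviour discussed in the post shows up.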
Science and Technology (October 31st 2021)

Though exoskeletons are exciting and they allow some of us to carry on with physical activities despite handicaps, they appear to require quite a bit of brain power. In effect, though they may help you move, they require a lot of mental effort which can be distracting. It is difficult to make people smarter by changing their environment. Socioeconomic background effects on children’s cognitive development and student achievement are likely to be spurious according to new research. There is a U-shaped association between cardiovascular health and physical activity. A moderate amount of physical activity is really good for you, but you can do too much. In richer countries where women are treated as equals to men, there are greater differences between boys’ and girls’ aspirations. For example, boys are more likely to be attracted to science compared to girls in a rich country. Scientists spend ever more time in school for relatively few desirable positions. These scientists are then ever more likely to pursue post-doctoral positions. These longer studies are a significant net loss of their lifetime income.

https://lemire.me/blog/2021/10/31/science-and-technology-october-31st-2021/
Stop spending so much time being trolled by billionaire corporations!

When I was a kid, my parents would turn on the television set, and we would get to watch whatever the state television decided we would watch. It was a push model. Some experts pick the content you need and they deliver it to you. You have little say in the matter. There was one newspaper in my town. We had two or three TV channels. Traditional schools also operate on a push model: the teacher decides whatever you get to learn. So I got to watch hours and hours of incredibly boring TV shows because there was nothing good on. I was very interested in computers and science, but there was almost nothing relevant in the major news sources. When personal computers became popular, I quickly learned more about them than any journalist had. In the early days of the Internet, people wrote on posting boards. Some started blogs. To this day, I get much of my online news through an RSS aggregator which collects information from various blogs and news sites. An RSS aggregator simply picks up all of the news items…

https://lemire.me/blog/2021/11/02/stop-spending-so-much-time-being-trolled-by-billionaire-corporations/
Checking simple equations or inequalities with z3

When programming, you sometimes need to make sure that a given formula is correct. Of course, you can rely on your mastery of high-school mathematics, but human beings, in general, are terrible at formal mathematics. Thankfully, you can outsource simple problems to a software library. If you are a Python user, you can install z3 relatively easily with pip: pip install z3-solver. You are then good to go! Suppose that you want to be sure that ( 1 + y ) / 2 < y for all 32-bit integers y. You can check it with the following Python script:
    import z3
    y = z3.BitVec("y", 32)
    s = z3.Solver()
    s.add( ( 1 + y ) / 2 >= y )
    if(s.check() == z3.sat):
        model = s.model()
        print(model)
We construct a “bit vector” spanning 32 bits to represent our integer. The z3 library considers such a number by default as a signed integer going from -2147483648 to 2147483647. We check the reverse inequality, because we are looking for a counterexample. Running the above script, Python prints back the integer 2863038463. How can this…

https://lemire.me/blog/2021/11/11/checking-simple-equations-or-inequalities-with-z3/
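The value 2863038463 reported in the excerpt above is larger than 2147483647, so one plausible reading is that it is the unsigned rendering of a negative 32-bit value. The sketch below checks that reading with plain arithmetic. It assumes two's complement and signed division that truncates toward zero, as in C++; the helper names are hypothetical.

```cpp
#include <cstdint>

// Reinterprets a 32-bit pattern as a two's complement signed value.
// The work is done in 64-bit arithmetic to avoid narrowing casts.
int64_t as_signed32(uint32_t bits) {
  int64_t v = bits;
  const int64_t two31 = int64_t(1) << 31;
  const int64_t two32 = int64_t(1) << 32;
  return (v >= two31) ? v - two32 : v;
}

// True when y is a counterexample to (1 + y) / 2 < y.
// No wraparound is modelled here, so this matches 32-bit arithmetic
// only when 1 + y does not overflow (that is, y != 2147483647).
bool violates(int64_t y) {
  return (1 + y) / 2 >= y;
}
```

Under these assumptions the pattern 2863038463 reads as -1431928833, and halving a negative number moves it toward zero, which makes it larger than the original.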
Science and Technology links (November 13th 2021)

Pacific rougheye rockfish can live hundreds of years while other rockfish barely live past ten years. Female condors can reproduce without males. The phenomenon is known as parthenogenesis and it occurs in birds such as chickens. It does not happen in mammals naturally as far as we know. Chimpanzees can thrive in virtual reality. People have a good opinion of their own environmental impact and they are biased against the behaviour of others. In other words, we are environmental hypocrites. A majority of minimum wage workers are aged 15 to 24 in Canada. Though we tend to view American businesses as uniquely innovative, Nordic countries in Europe may be just as innovative, albeit with a small population. This could be explained in part by the fact that Nordic countries have advantageous tax regimes for businesses. Smartphone vendors like to scan our content for illegal pictures. From databases of pictures, they seek to detect pictures that you own that might be visually similar. Unfortunately, it appears that the current breed of algorithms is easily fooled. Eating several eggs a day might…

https://lemire.me/blog/2021/11/13/science-and-technology-links-novembre-13rd-2021/
Converting integers to fixed-digit representations quickly

It is tricky to convert integers into strings because the number of characters can vary according to the amplitude of the integer. The integer ‘1’ requires a single character whereas the integer ‘100’ requires three characters. So a solution might possibly need a hard-to-predict branch. Let us simplify the problem. Imagine that you want to serialize integers to fixed-digit strings. Thus you may want to convert integers of up to 16 digits (smaller than 10000000000000000) to exactly 16 digits, including leading zeros if needed. In this manner, it is easy to write code that contains only trivial branches. The simplest approach could be a character-by-character routine where I use the fact that the character ‘0’ in ASCII is just 0x30 (in hexadecimal):
    void to_string_backlinear(uint64_t x, char *out) {
      for(int z = 0; z < 16; z++) {
        out[15-z] = (x % 10) + 0x30;
        x /= 10;
      }
    }
It is somewhat strange to write the characters backward, starting from the least significant digit. You can try to go forward, but it is a bit trickier. Here is one ugly approach that is probably…

https://lemire.me/blog/2021/11/18/converting-integers-to-fix-digit-representations-quickly/
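To make the fixed-width idea from the excerpt above concrete, here is a minimal sketch that wraps the backward routine and adds a naive forward variant for comparison. The forward function and the std::string wrapper are assumptions added for illustration; they are not the forward approach from the post, which is cut off in the excerpt. Both routines assume the input is smaller than 10^16.

```cpp
#include <cstdint>
#include <string>

// Backward routine from the excerpt: fills 16 digits from the right,
// so leading zeros appear naturally. No terminator is written.
void to_string_backlinear(uint64_t x, char *out) {
  for (int z = 0; z < 16; z++) {
    out[15 - z] = static_cast<char>((x % 10) + 0x30);
    x /= 10;
  }
}

// Hypothetical forward variant: peels off the most significant digit
// first by dividing by a shrinking power of ten.
void to_string_forward(uint64_t x, char *out) {
  uint64_t divisor = 1000000000000000ULL; // 10^15
  for (int z = 0; z < 16; z++) {
    out[z] = static_cast<char>((x / divisor) + 0x30);
    x %= divisor;
    divisor /= 10; // reaches 0 only after the final digit is written
  }
}

// Convenience wrapper returning a 16-character string.
std::string to_fixed16(uint64_t x) {
  char buf[16];
  to_string_backlinear(x, buf);
  return std::string(buf, 16);
}
```

The forward variant performs two divisions per digit, so it is shown for clarity rather than speed.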
Are tenured professors more likely to speak freely?

University professors often have robust job security after a time: they receive tenure. It means that they usually do not have to worry about applying for a new job after a few years. Tenure is not available in all countries. Countries like Australia reassess positions every few years. So why does it exist where it does? One of the justifications for tenure is that professors who have tenure can speak more freely. Thus, in theory, they can be critical of government or corporate policies. Do they? What would “speaking freely” entail? What about denouncing a colleague who commits blatant fraud? On this front, the evidence is not great. Diederik Stapel published well over 100 research papers in prestigious journals. He was fired when it was determined that he was making up all of his research data. It took outsiders (students) to report him. Harvard professor Marc Hauser published over 200 papers in the best journals, making up data as he went. It took naive students to report the fraud. We find too many examples of overt fraud in science, and rarely do we find…

https://lemire.me/blog/2021/11/27/are-tenured-professors-more-likely-to-speak-freely/
Science and Technology links (November 28th 2021)

Government-funded research is getting more political and less diverse: The frequency of documents containing highly politicized terms has been increasing consistently over the last three decades. The most politicized field is Education & Human Resources. The least are Mathematical & Physical Sciences and Computer & Information Science & Engineering, although even they are significantly more politicized than any field was in 1990. At the same time, abstracts have been becoming more similar to each other over time. Taken together, the results imply that there has been a politicization of scientific funding in the US in recent years and a decrease in the diversity of ideas supported. Parabiosis is the process of tying the blood vessels of animals together. Zhang et al. proceeded with parabiosis between young and old mice, followed by a detachment period. The old mice lived longer than control mice and they appear to have been rejuvenated by the parabiosis. A single injection can enable paralyzed mice to walk again. The Arctic has been warming for much longer than we thought. Dogs fed only once daily are healthier. India fertility…

https://lemire.me/blog/2021/11/28/science-and-technology-links-novembre-28th-2021/
Can you safely parse a double when you need a float?

In C as well as many other programming languages, we have 32-bit and 64-bit floating-point numbers. They are often referred to as float and double. Most systems today follow the IEEE 754 standard which means that you can get consistent results across programming languages and operating systems. Hence, it does not matter very much if you implement your software in C++ under Linux while someone else implements it in C# under Windows: if you both have recent systems, you can expect identical numerical outcomes. When you are reading these numbers from a string, there are distinct functions. In C, you have strtof and strtod. One parses a string to a float and the other function parses it to a double. At a glance, it seems redundant. Why not just parse your string to a double value and cast it back to a float, if needed? Of course, that would be slightly more expensive. But, importantly, it also gives incorrect results in the sense that it is not equivalent to parsing directly to a float. In other words, these functions…

https://lemire.me/blog/2021/11/30/can-you-safely-parse-a-double-and-then-cast-to-a-float/
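The difference between the two parsing routes described in the excerpt above can be observed with two one-line helpers. The sketch below is illustrative: the helper names are hypothetical, and the sample input is constructed here, not taken from the post. It sits just above the midpoint between 1 and the next float (1 + 2^-24), but so close to it that a correctly rounded double lands exactly on the midpoint. It assumes the standard library rounds strtof and strtod correctly and that the default round-to-nearest-even mode is in effect.

```cpp
#include <cstdlib>

// Route 1: parse the string directly to a float.
float parse_direct(const char *s) {
  return std::strtof(s, nullptr);
}

// Route 2: parse to a double, then narrow to a float (two roundings).
float parse_via_double(const char *s) {
  return static_cast<float>(std::strtod(s, nullptr));
}

// Constructed sample: 1 + 2^-24 + 1e-34, written out in decimal.
// 2^-24 is exactly 0.000000059604644775390625.
const char *const kDoubleRoundingSample =
    "1."
    "0000000"            // seven zeros before the digits of 2^-24
    "59604644775390625"  // the digits of 2^-24 (that is, 5^24)
    "0000000001";        // tiny excess, far below double precision
```

On such an input, the direct route is expected to round up to the next float after 1, while the two-step route first loses the tiny excess and then resolves the tie toward 1. For most everyday inputs, such as "0.5", both routes agree.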